

U.S.-Canada Power System Outage Task Force 


Final Report on the 
August 14, 2003 Blackout 

in the 

United States and Canada 

Causes and 
Recommendations 












April 2004 



U.S.-Canada Power System Outage Task Force 



March 31, 2004 

Dear Mr. President and Prime Minister: 

We are pleased to submit the Final Report of the U.S.-Canada Power System Outage 
Task Force. As directed by you, the Task Force has completed a thorough investigation 
of the causes of the August 14, 2003 blackout and has recommended actions to minimize 
the likelihood and scope of similar events in the future. 

The report makes clear that this blackout could have been prevented and that immediate 
actions must be taken in both the United States and Canada to ensure that our electric 
system is more reliable. First and foremost, compliance with reliability rules must be 
made mandatory with substantial penalties for non-compliance. 

We expect continued collaboration between our two countries to implement this report’s 
recommendations. Failure to implement the recommendations would threaten the 
reliability of the electricity supply that is critical to the economic, energy and national 
security of our countries. 

The work of the Task Force has been an outstanding example of close and effective 
cooperation between the U.S. and Canadian governments. Such work will continue as we 
strive to implement the Final Report’s recommendations. We resolve to work in 
cooperation with Congress, Parliament, states, provinces and stakeholders to ensure that 
North America’s electric grid is robust and reliable. 

We would like to specifically thank the members of the Task Force and its Working 
Groups for their efforts and support as we investigated the blackout and moved toward 
completion of the Final Report. All involved have made valuable contributions. We 
submit this report with optimism that its recommendations will result in better electric 
service for the people of both our nations. 


Sincerely, 



U.S. Secretary of Energy Minister of Natural Resources Canada 
1. Introduction


On August 14, 2003, large portions of the Midwest
and Northeast United States and Ontario, Canada,
experienced an electric power blackout. The outage
affected an area with an estimated 50 million
people and 61,800 megawatts (MW) of electric
load in the states of Ohio, Michigan, Pennsylvania,
New York, Vermont, Massachusetts, Connecticut,
New Jersey and the Canadian province of
Ontario. The blackout began a few minutes after
4:00 pm Eastern Daylight Time (16:00 EDT), and
power was not restored for 4 days in some parts of
the United States. Parts of Ontario suffered rolling
blackouts for more than a week before full power
was restored. Estimates of total costs in the United
States range between $4 billion and $10 billion
(U.S. dollars). 1 In Canada, gross domestic product
was down 0.7% in August, there was a net loss of
18.9 million work hours, and manufacturing shipments
in Ontario were down $2.3 billion (Canadian
dollars). 2

On August 15, President George W. Bush and
then-Prime Minister Jean Chrétien directed that a
joint U.S.-Canada Power System Outage Task
Force be established to investigate the causes of
the blackout and ways to reduce the possibility of
future outages. They named U.S. Secretary of
Energy Spencer Abraham and Herb Dhaliwal,
Minister of Natural Resources, Canada, to chair
the joint Task Force. (Mr. Dhaliwal was later
succeeded by Mr. John Efford as Minister of Natural
Resources and as co-chair of the Task Force.)
Three other U.S. representatives and three other
Canadian representatives were named to the
Task Force. The U.S. members were Tom Ridge,
Secretary of Homeland Security; Pat Wood III,
Chairman of the Federal Energy Regulatory
Commission; and Nils Diaz, Chairman of the Nuclear
Regulatory Commission. The Canadian members
were Deputy Prime Minister John Manley, later
succeeded by Deputy Prime Minister Anne
McLellan; Kenneth Vollman, Chairman of the
National Energy Board; and Linda J. Keen,
President and CEO of the Canadian Nuclear Safety
Commission.


The Task Force divided its work into two phases: 

♦ Phase I: Investigate the outage to determine its 
causes and why it was not contained. 

♦ Phase II: Develop recommendations to reduce 
the possibility of future outages and reduce the 
scope of any that occur. 

The Task Force created three Working Groups to 
assist in both phases of its work: an Electric System
Working Group (ESWG), a Nuclear Working
Group (NWG), and a Security Working Group 
(SWG). The Working Groups were made up of 
state and provincial representatives, federal 
employees, and contractors working for the U.S. 
and Canadian government agencies represented 
on the Task Force. 

The Task Force published an Interim Report on
November 19, 2003, summarizing the facts that
the bi-national investigation found regarding the
causes of the blackout on August 14, 2003. After
November 19, the Task Force’s technical
investigation teams pursued certain analyses that were
not complete in time for publication in the Interim
Report. The Working Groups focused on the
drafting of recommendations for the consideration of
the Task Force to prevent future blackouts and
reduce the scope of any that nonetheless occur. In
drafting these recommendations, the Working
Groups drew substantially on information and
insights from the investigation teams’ additional
analyses, and on inputs received at three public
meetings (in Cleveland, New York City, and
Toronto) and two technical conferences (in
Philadelphia and Toronto). They also drew on
comments filed electronically by interested parties on
websites established for this purpose by the
U.S. Department of Energy and Natural Resources
Canada.

Although this Final Report presents some new
information about the events and circumstances
before the start of the blackout and additional
detail concerning the cascade stage of the blackout,
none of the comments received or additional
analyses performed by the Task Force’s investigators
have changed the validity of the conclusions
published in the Interim Report. This report,
however, presents findings concerning additional
violations of reliability requirements and
institutional and performance deficiencies beyond those
identified in the Interim Report.

The organization of this Final Report is similar to 
that of the Interim Report, and it is intended to 
update and supersede the Interim Report. It is 
divided into ten chapters, including this introductory
chapter:

♦ Chapter 2 provides an overview of the
institutional framework for maintaining and ensuring
the reliability of the bulk power system in North
America, with particular attention to the roles
and responsibilities of several types of
reliability-related organizations.

♦ Chapter 3 identifies the causes of the blackout 
and identifies failures to perform effectively 
relative to the reliability policies, guidelines, 
and standards of the North American Electric 
Reliability Council (NERC) and, in some cases, 
deficiencies in the standards themselves. 

♦ Chapter 4 discusses conditions on the regional 
power system on and before August 14 and 
identifies conditions and failures that did and 
did not contribute to the blackout. 

♦ Chapter 5 describes the afternoon of August 14, 
starting from normal operating conditions, then 
going into a period of abnormal but still potentially
manageable conditions, and finally into
an uncontrollable blackout in northern Ohio. 

♦ Chapter 6 provides details on the cascade phase 
of the blackout as it spread in Ohio and then 
across the Northeast, and explains why the system
performed as it did.

♦ Chapter 7 compares the August 14, 2003, blackout
with previous major North American power
outages. 

♦ Chapter 8 examines the performance of the 
nuclear power plants affected by the August 14 
outage. 

♦ Chapter 9 addresses issues related to physical 
and cyber security associated with the outage. 


♦ Chapter 10 presents the Task Force’s recommendations
for preventing future blackouts and
reducing the scope of any that occur. 

Chapter 10 includes a total of 46 recommendations,
but the single most important of them is that
the U.S. Congress should enact the reliability
provisions in H.R. 6 and S. 2095 to make compliance
with reliability standards mandatory and
enforceable. If that could be done, many of the other
recommended actions could be accomplished readily
in the course of implementing the legislation. An
overview of the recommendations (by titles only)
is provided on pages 3 and 4.

Chapter 2 is very little changed from the version 
published in the Interim Report. Chapter 3 is new 
to this Final Report. Chapters 4, 5, and 6 have been 
revised and expanded from the corresponding 
chapters (3, 4, and 5) of the Interim Report. Chapters
7, 8, and 9 are only slightly changed from
Chapters 6, 7, and 8 of the Interim Report. The 
Interim Report had no counterpart to Chapter 10. 

This report also includes seven appendixes: 

♦ Appendix A lists the members of the Task Force 
and the three working groups. 

♦ Appendix B describes the Task Force’s investigative
process for developing the Task Force’s
recommendations. 

♦ Appendix C lists the parties who either
commented on the Interim Report, provided
suggestions for recommendations, or both.

♦ Appendix D reproduces a document released on 
February 10, 2004 by NERC, describing its 
actions to prevent and mitigate the impacts of 
future cascading blackouts. 

♦ Appendix E is a list of electricity acronyms. 

♦ Appendix F provides a glossary of electricity 
terms. 

♦ Appendix G contains transmittal letters pertinent
to this report from the three Working
Groups. 
Overview of Task Force Recommendations: Titles Only

Group I. Institutional Issues Related to Reliability 

1. Make reliability standards mandatory and enforceable, with penalties for noncompliance. 

2. Develop a regulator-approved funding mechanism for NERC and the regional reliability councils, 
to ensure their independence from the parties they oversee. 

3. Strengthen the institutional framework for reliability management in North America. 

4. Clarify that prudent expenditures and investments for bulk system reliability (including investments
in new technologies) will be recoverable through transmission rates.

5. Track implementation of recommended actions to improve reliability. 

6. FERC should not approve the operation of new RTOs or ISOs until they have met minimum 
functional requirements. 

7. Require any entity operating as part of the bulk power system to be a member of a regional reliability
council if it operates within the council’s footprint.

8. Shield operators who initiate load shedding pursuant to approved guidelines from liability or 
retaliation. 

9. Integrate a “reliability impact” consideration into the regulatory decision-making process. 

10. Establish an independent source of reliability performance information. 

11. Establish requirements for collection and reporting of data needed for post-blackout analyses. 

12. Commission an independent study of the relationships among industry restructuring, competition,
and reliability.

13. DOE should expand its research programs on reliability-related tools and technologies. 

14. Establish a standing framework for the conduct of future blackout and disturbance 
investigations. 

Group II. Support and Strengthen NERC’s Actions of February 10, 2004 

15. Correct the direct causes of the August 14, 2003 blackout. 

16. Establish enforceable standards for maintenance of electrical clearances in right-of-way areas. 

17. Strengthen the NERC Compliance Enforcement Program. 

18. Support and strengthen NERC’s Reliability Readiness Audit Program. 

19. Improve near-term and long-term training and certification requirements for operators, reliability 
coordinators, and operator support staff. 

20. Establish clear definitions for normal, alert and emergency operational system conditions. Clarify 
roles, responsibilities, and authorities of reliability coordinators and control areas under each 
condition. 

21. Make more effective and wider use of system protection measures. 

22. Evaluate and adopt better real-time tools for operators and reliability coordinators. 

23. Strengthen reactive power and voltage control practices in all NERC regions. 

24. Improve quality of system modeling data and data exchange practices. 

25. NERC should reevaluate its existing reliability standards development process and accelerate the 
adoption of enforceable standards. 

26. Tighten communications protocols, especially for communications during alerts and emergencies.
Upgrade communication system hardware where appropriate.

27. Develop enforceable standards for transmission line ratings. 

28. Require use of time-synchronized data recorders. 

29. Evaluate and disseminate lessons learned during system restoration. 

30. Clarify criteria for identification of operationally critical facilities, and improve dissemination of 
updated information on unplanned outages. 

31. Clarify that the transmission loading relief (TLR) process should not be used in situations involving
an actual violation of an Operating Security Limit. Streamline the TLR process.


Group III. Physical and Cyber Security of North American Bulk Power Systems 

32. Implement NERC IT standards. 

33. Develop and deploy IT management procedures. 

34. Develop corporate-level IT security governance and strategies. 

35. Implement controls to manage system health, network monitoring, and incident management. 

36. Initiate U.S.-Canada risk management study. 

37. Improve IT forensic and diagnostic capabilities. 

38. Assess IT risk and vulnerability at scheduled intervals. 

39. Develop capability to detect wireless and remote wireline intrusion and surveillance. 

40. Control access to operationally sensitive equipment. 

41. NERC should provide guidance on employee background checks. 

42. Confirm NERC ES-ISAC as the central point for sharing security information and analysis. 

43. Establish clear authority for physical and cyber security. 

44. Develop procedures to prevent or mitigate inappropriate disclosure of information. 

Group IV. Canadian Nuclear Power Sector 

45. The Task Force recommends that the Canadian Nuclear Safety Commission request Ontario 
Power Generation and Bruce Power to review operating procedures and operator training associated
with the use of adjuster rods.

46. The Task Force recommends that the Canadian Nuclear Safety Commission purchase and install 
backup generation equipment. 


Endnotes 

1 See “The Economic Impacts of the August 2003 Blackout,”
Electric Consumer Research Council (ELCON), February 2,
2004.


2 Statistics Canada, Gross Domestic Product by Industry, 
August 2003, Catalogue No. 15-001; September 2003 Labour 
Force Survey; Monthly Survey of Manufacturing, August 2003, 
Catalogue No. 31-001. 
2. Overview of the North American Electric Power
System and Its Reliability Organizations 


The North American Power Grid 
Is One Large, Interconnected 
Machine 

The North American electricity system is one of 
the great engineering achievements of the past 100 
years. This electricity infrastructure represents 
more than $1 trillion (U.S.) in asset value, more 
than 200,000 miles, or 320,000 kilometers (km),
of transmission lines operating at 230,000 volts
and greater, 950,000 megawatts of generating 
capability, and nearly 3,500 utility organizations 
serving well over 100 million customers and 283 
million people. 

Modern society has come to depend on reliable
electricity as an essential resource for national 
security; health and welfare; communications; 
finance; transportation; food and water supply; 
heating, cooling, and lighting; computers and 
electronics; commercial enterprise; and even 
entertainment and leisure—in short, nearly all 
aspects of modern life. Customers have grown to 
expect that electricity will almost always be
available when needed at the flick of a switch. Most
customers have also experienced local outages
caused by a car hitting a power pole, a
construction crew accidentally damaging a cable, or a
lightning storm. What is not expected is the
occurrence of a massive outage on a calm, warm day.
Widespread electrical outages, such as the one 
that occurred on August 14, 2003, are rare, but 
they can happen if multiple reliability safeguards 
break down. 

Providing reliable electricity is an enormously 
complex technical challenge, even on the most 
routine of days. It involves real-time assessment, 
control and coordination of electricity production 
at thousands of generators, moving electricity 
across an interconnected network of transmission 
lines, and ultimately delivering the electricity to 
millions of customers by means of a distribution 
network. 

As shown in Figure 2.1, electricity is produced at
lower voltages (10,000 to 25,000 volts) at generators
from various fuel sources, such as nuclear,
coal, oil, natural gas, hydro power, geothermal,
and photovoltaic. Some generators are owned by
the same electric utilities that serve the end-use
customer; some are owned by independent power
producers (IPPs); and others are owned by customers
themselves, particularly large industrial
customers.

Figure 2.1. Basic Structure of the Electric System

[Figure labels: Generating Station; Generator Step-Up Transformer;
Transmission Lines (765, 500, 345, 230, and 138 kV); Transmission
Customer (138 kV or 230 kV); Substation Step-Down Transformer;
Subtransmission Customer (26 kV and 69 kV); Primary Customer
(13 kV and 4 kV); Secondary Customer (120 V and 240 V).
Color key: blue, transmission; green, distribution; black, generation.]

Electricity from generators is “stepped up” to
higher voltages for transportation in bulk over
transmission lines. Operating the transmission
lines at high voltage (i.e., 230,000 to 765,000 volts)
reduces the losses of electricity from conductor
heating and allows power to be shipped economically
over long distances. Transmission lines are
interconnected at switching stations and substations
to form a network of lines and stations called
a power “grid.” Electricity flows through the
interconnected network of transmission lines from the
generators to the loads in accordance with the
laws of physics, along “paths of least resistance,”
in much the same way that water flows through a
network of canals. When the power arrives near a
load center, it is “stepped down” to lower voltages
for distribution to customers. The bulk power system
is predominantly an alternating current (AC)
system, as opposed to a direct current (DC) system,
because of the ease and low cost with which
voltages in AC systems can be converted from one
level to another. Some larger industrial and
commercial customers take service at intermediate
voltage levels (12,000 to 115,000 volts), but most
residential customers take their electrical service
at 120 and 240 volts.
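
As a rough illustration of why higher transmission voltage reduces
conductor-heating losses, the sketch below computes the resistive
loss for delivering the same power at two voltages. It assumes a
simplified single-phase line at unity power factor; the 100 MW
transfer, the 10-ohm line resistance, and the function name
line_loss_mw are illustrative assumptions, not values or methods
taken from this report.

```python
def line_loss_mw(power_mw: float, voltage_kv: float, resistance_ohm: float) -> float:
    """Resistive loss (MW) on a simplified single-phase line at unity power factor."""
    current_ka = power_mw / voltage_kv        # I = P / V  (MW / kV gives kA)
    return current_ka ** 2 * resistance_ohm   # loss = I^2 * R  (kA^2 * ohm gives MW)

# Hypothetical example values chosen only for illustration.
POWER_MW = 100.0
RESISTANCE_OHM = 10.0

loss_138 = line_loss_mw(POWER_MW, 138.0, RESISTANCE_OHM)
loss_345 = line_loss_mw(POWER_MW, 345.0, RESISTANCE_OHM)
print(f"138 kV: {loss_138:.2f} MW lost; 345 kV: {loss_345:.2f} MW lost")
```

Because the current falls in proportion to the voltage and the loss
varies with the square of the current, raising the voltage by a
factor of 2.5 cuts the loss in this simplified model by a factor of
6.25.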

While the power system in North America is
commonly referred to as “the grid,” there are actually
three distinct power grids or “interconnections”
(Figure 2.2). The Eastern Interconnection includes
the eastern two-thirds of the continental United
States and Canada from Saskatchewan east to the
Maritime Provinces. The Western Interconnection
includes the western third of the continental
United States (excluding Alaska), the Canadian
provinces of Alberta and British Columbia, and a
portion of Baja California Norte, Mexico. The third
interconnection comprises most of the state of
Texas. The three interconnections are electrically
independent from each other except for a few
small direct current (DC) ties that link them.
Within each interconnection, electricity is
produced the instant it is used, and flows over
virtually all transmission lines from generators to
loads.

Figure 2.2. North American Interconnections

The northeastern portion of the Eastern Interconnection
(about 10 percent of the interconnection’s
total load) was affected by the August 14 blackout. 
The other two interconnections were not 
affected. 1 

Planning and Reliable Operation 
of the Power Grid Are Technically 
Demanding 

Reliable operation of the power grid is complex 
and demanding for two fundamental reasons: 

♦ First, electricity flows at close to the speed of 
light (186,000 miles per second or about 300,000
km/sec) and is not economically storable in
large quantities. Therefore electricity must be 
produced the instant it is used. 

♦ Second, without the use of control devices too
expensive for general use, the flow of alternating
current (AC) electricity cannot be controlled
like a liquid or gas by opening or closing a valve
in a pipe, or switched like calls over a
long-distance telephone network. 2 Electricity flows
freely along all available paths from the generators
to the loads in accordance with the laws of
physics, dividing among all connected flow
paths in the network, in inverse proportion to
the impedance (resistance plus reactance) on
each path.
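
The inverse-proportion rule in the second point can be sketched
numerically. The example below is a minimal illustration, assuming
that the parallel paths connect the same two points and that each
impedance can be treated as a single magnitude in ohms (a real
power-flow study uses complex impedances and the full network). The
function name split_flow and all numbers are hypothetical.

```python
def split_flow(total_mw: float, impedances_ohm: list[float]) -> list[float]:
    """Divide a transfer among parallel paths in inverse proportion to impedance."""
    admittances = [1.0 / z for z in impedances_ohm]   # lower impedance, larger share
    total_admittance = sum(admittances)
    return [total_mw * y / total_admittance for y in admittances]

# Hypothetical case: 300 MW over three parallel paths of 10, 20, and 20 ohms.
flows = split_flow(300.0, [10.0, 20.0, 20.0])
print([round(f, 1) for f in flows])
```

In this sketch the 10-ohm path carries about half of the transfer
and each 20-ohm path about a quarter. No operator action is needed
to produce that split, which is also why flow shifts onto the
remaining paths automatically when one path is lost.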

Maintaining reliability is a complex enterprise 
that requires trained and skilled operators,
sophisticated computers and communications, and
careful planning and design. The North American
Electric Reliability Council (NERC) and its ten
Regional Reliability Councils have developed
system operating and planning standards for
ensuring the reliability of a transmission grid that are
based on seven key concepts: 

♦ Balance power generation and demand 
continuously. 

♦ Balance reactive power supply and demand to 
maintain scheduled voltages. 

♦ Monitor flows over transmission lines and other 
facilities to ensure that thermal (heating) limits 
are not exceeded. 
♦ Keep the system in a stable condition.

♦ Operate the system so that it remains in a
reliable condition even if a contingency occurs,
such as the loss of a key generator or
transmission facility (the “N-1 criterion”).

♦ Plan, design, and maintain the system to
operate reliably.

♦ Prepare for emergencies. 

These seven concepts are explained in more detail below.

1. Balance power generation and demand
continuously. To enable customers to use as much
electricity as they wish at any moment,
production by the generators must be scheduled or
“dispatched” to meet constantly changing
demands, typically on an hourly basis, and then
fine-tuned throughout the hour, sometimes
through the use of automatic generation
controls to continuously match generation to actual
demand. Demand is somewhat predictable,
appearing as a daily demand curve: in the
summer, highest during the afternoon and
evening and lowest in the middle of the night, and
higher on weekdays when most businesses are
open (Figure 2.3).

Figure 2.3. PJM Load Curve, August 18-24, 2003

Failure to match generation to demand causes
the frequency of an AC power system
(nominally 60 cycles per second or 60 Hertz) to
increase (when generation exceeds demand) or
decrease (when generation is less than demand)
(Figure 2.4). Random, small variations in
frequency are normal, as loads come on and off
and generators modify their output to follow the
demand changes. However, large deviations in
frequency can cause the rotational speed of
generators to fluctuate, leading to vibrations that
can damage generator turbine blades and other
equipment. Extreme low frequencies can trigger
automatic under-frequency “load shedding,”
which takes blocks of customers off-line in
order to prevent a total collapse of the electric
system. As will be seen later in this report, such
an imbalance of generation and demand can
also occur when the system responds to major
disturbances by breaking into separate
“islands”; any such island may have an excess
or a shortage of generation, compared to
demand within the island.
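
The direction and approximate speed of the frequency change
described above can be sketched with the aggregate swing equation, a
standard textbook approximation that is not taken from this report.
The sketch ignores governor response and load damping, so it gives
only the initial rate of change. The island size, the inertia
constant of 5 seconds, and the 59.3 Hz threshold are hypothetical
values chosen for illustration.

```python
NOMINAL_HZ = 60.0

def rocof_hz_per_s(generation_mw: float, load_mw: float,
                   inertia_s: float, system_mva: float) -> float:
    """Initial rate of change of frequency: df/dt = f0 * (Pgen - Pload) / (2 * H * S)."""
    imbalance_mw = generation_mw - load_mw
    return NOMINAL_HZ * imbalance_mw / (2.0 * inertia_s * system_mva)

def seconds_to_threshold(rate_hz_per_s: float, threshold_hz: float) -> float:
    """Time to reach threshold_hz if the initial rate stayed constant (a simplification)."""
    if rate_hz_per_s == 0:
        return float("inf")   # balanced system: frequency does not move
    return (threshold_hz - NOMINAL_HZ) / rate_hz_per_s

# Hypothetical island: 9,000 MW of generation serving 10,000 MW of load.
rate = rocof_hz_per_s(9_000.0, 10_000.0, inertia_s=5.0, system_mva=12_000.0)
time_to_trip = seconds_to_threshold(rate, threshold_hz=59.3)
print(f"Frequency falls at {abs(rate):.2f} Hz/s; "
      f"59.3 Hz reached in about {time_to_trip:.1f} s")
```

A negative rate means a generation shortage and falling frequency; a
positive rate means surplus generation and rising frequency. Under
these assumptions an island short by 10 percent of its load would
reach the assumed load-shedding threshold in roughly a second and a
half, which suggests why such protection has to be automatic.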

2. Balance reactive power supply and demand to
maintain scheduled voltages. Reactive power
sources, such as capacitor banks and generators,
must be adjusted during the day to maintain
voltages within a secure range pertaining to
all system electrical equipment (stations,
transmission lines, and customer equipment). Most
generators have automatic voltage regulators
that cause the reactive power output of generators
to increase or decrease to control voltages to
scheduled levels. Low voltage can cause electric
system instability or collapse and, at distribution
voltages, can cause damage to motors and
the failure of electronic equipment. High voltages
can exceed the insulation capabilities of
equipment and cause dangerous electric arcs
(“flashovers”).
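
To make the relationship between real and reactive power concrete,
the sketch below uses the standard power-triangle relations
(S = P / pf and Q = P tan φ, where pf = cos φ). These are textbook
formulas rather than material from this report, and the 100 MW load
at a 0.9 power factor is a hypothetical example.

```python
import math

def apparent_and_reactive(real_mw: float, power_factor: float) -> tuple[float, float]:
    """Return (apparent power in MVA, reactive power in MVAr) for a given real power."""
    apparent_mva = real_mw / power_factor             # S = P / pf
    phase_angle = math.acos(power_factor)             # angle between voltage and current
    reactive_mvar = real_mw * math.tan(phase_angle)   # Q = P * tan(phi)
    return apparent_mva, reactive_mvar

# Hypothetical load: 100 MW at a 0.9 power factor.
apparent, reactive = apparent_and_reactive(100.0, 0.9)
print(f"S = {apparent:.1f} MVA, Q = {reactive:.1f} MVAr")
```

In this example the load draws roughly 48 MVAr of reactive power in
addition to its 100 MW of real power, so the equipment serving it
must be sized for about 111 MVA.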

Local Supplies of Reactive Power Are Essential to Maintaining Voltage Stability

A generator typically produces some mixture of
“real” and “reactive” power, and the balance
between them can be adjusted at short notice to
meet changing conditions. Real power, measured
in watts, is the form of electricity that powers
equipment. Reactive power, a characteristic of
AC systems, is measured in volt-amperes
reactive (VAr), and is the energy supplied to create or
be stored in electric or magnetic fields in and
around electrical equipment. Reactive power is
particularly important for equipment that relies
on magnetic fields for the production of induced
electric currents (e.g., motors, transformers,
pumps, and air conditioning). Transmission
lines both consume and produce reactive power.
At light loads they are net producers, and at
heavy loads, they are heavy consumers. Reactive
power consumption by these facilities or devices
tends to depress transmission voltage, while its
production (by generators) or injection (from
storage devices such as capacitors) tends to
support voltage. Reactive power can be transmitted
only over relatively short distances during heavy
load conditions. If reactive power cannot be
supplied promptly and in sufficient quantity,
voltages decay, and in extreme cases a “voltage
collapse” may result.

Figure 2.4. Normal and Abnormal Frequency Ranges

3. Monitor flows over transmission lines and
other facilities to ensure that thermal (heating)
limits are not exceeded. The dynamic interactions
between generators and loads, combined
with the fact that electricity flows freely across
all interconnected circuits, mean that power
flow is ever-changing on transmission and
distribution lines. All lines, transformers, and
other equipment carrying electricity are heated
by the flow of electricity through them. The
flow must be limited to avoid overheating and
damaging the equipment. In the case of overhead
power lines, heating also causes the metal
conductor to stretch or expand and sag closer to
ground level. Conductor heating is also affected
by ambient temperature, wind, and other factors.
Flow on overhead lines must be limited to
ensure that the line does not sag into obstructions
below such as trees or telephone lines, or
violate the minimum safety clearances between
the energized lines and other objects. (A short
circuit or “flashover,” which can start fires or
damage equipment, can occur if an energized
line gets too close to another object.) Most
transmission lines, transformers and other
current-carrying devices are monitored continuously to
ensure that they do not become overloaded or
violate other operating constraints. Multiple
ratings are typically used, one for normal conditions
and a higher rating for emergencies. The
primary means of limiting the flow of power on
transmission lines is to adjust selectively the
output of generators.

4. Keep the system in a stable condition. Because
the electric system is interconnected and
dynamic, electrical stability limits must be
observed. Stability problems can develop very
quickly, in just a few cycles (a cycle is 1/60th of
a second), or more slowly, over seconds or
minutes. The main concern is to ensure that
generation dispatch and the resulting power
flows and voltages are such that the system is
stable at all times. (As will be described later in
this report, part of the Eastern Interconnection
became unstable on August 14, resulting in a
cascading outage over a wide area.) Stability
limits, like thermal limits, are expressed as a
maximum amount of electricity that can be
safely transferred over transmission lines.

There are two types of stability limits: (1)
Voltage stability limits are set to ensure that the
unplanned loss of a line or generator (which
may have been providing locally critical
reactive power support, as described previously)
will not cause voltages to fall to dangerously
low levels. If voltage falls too low, it begins to
collapse uncontrollably, at which point
automatic relays either shed load or trip generators
to avoid damage. (2) Power (angle) stability
limits are set to ensure that a short circuit or an
unplanned loss of a line, transformer, or generator
will not cause the remaining generators and
loads being served to lose synchronism with
one another. (Recall that all generators and
loads within an interconnection must operate at
or very near a common 60 Hz frequency.) Loss
of synchronism with the common frequency
means generators are operating out-of-step with
one another. Even modest losses of synchronism
can result in damage to generation equipment.
Under extreme losses of synchronism,
the grid may break apart into separate electrical
islands; each island would begin to maintain its
own frequency, determined by the load/generation
balance within the island.

5. Operate the system so that it remains in a reli¬ 
able condition even if a contingency occurs, 
such as the loss of a key generator or transmis¬ 
sion facility (the “N minus 1 criterion”). The 

central organizing principle of electricity reli¬ 
ability management is to plan for the unex¬ 
pected. The unique characteristics of electricity 


8 


O U.S.-Canada Power System Outage Task Force *0 August 14th Blackout: Causes and Recommendations <0* 




mean that problems, when they arise, can 
spread and escalate very quickly if proper safe¬ 
guards are not in place. Accordingly, through 
years of experience, the industry has developed 
a network of defensive strategies for maintain¬ 
ing reliability based on the assumption that 
equipment can and will fail unexpectedly upon 
occasion. 

This principle is expressed by the requirement that the system must be
operated at all times to ensure that it will remain in a secure
condition (generally within emergency ratings for current and voltage
and within established stability limits) following the loss of the most
important generator or transmission facility (a “worst single
contingency”). This is called the “N-1 criterion.” In other words,
because a generator or line trip can occur at any time from random
failure, the power system must be operated in a preventive mode so that
the loss of the most important generator or transmission facility does
not jeopardize the remaining facilities in the system by causing them to
exceed their emergency ratings or stability limits, which could lead to
a cascading outage.
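
The N-1 check described above can be illustrated with a minimal sketch. It is not taken from any NERC procedure or utility tool. It assumes a simplified model in which each line has a pre-contingency flow and an emergency rating, and in which a planning study has already supplied outage distribution factors (the hypothetical `lodf` table) giving the share of a tripped line's flow that shifts onto each remaining line. The line names, flows, ratings, and factors are invented for illustration, and only thermal limits are screened (voltage and stability limits are ignored).

```python
from dataclasses import dataclass


@dataclass
class Line:
    name: str
    flow_mw: float               # pre-contingency flow (hypothetical value)
    emergency_rating_mw: float   # emergency thermal rating (hypothetical value)


def n_minus_1_violations(lines, lodf):
    """Screen every single-line outage for thermal overloads.

    lodf[(k, m)] is the assumed fraction of line k's pre-outage flow that
    shifts onto line m when k trips. Returns a dict mapping each outaged
    line to the list of lines pushed above their emergency rating.
    """
    violations = {}
    for out in lines:
        overloaded = []
        for other in lines:
            if other.name == out.name:
                continue
            shift = lodf.get((out.name, other.name), 0.0) * out.flow_mw
            post_flow = other.flow_mw + shift
            if abs(post_flow) > other.emergency_rating_mw:
                overloaded.append(other.name)
        if overloaded:
            violations[out.name] = overloaded
    return violations


# Illustrative three-line system; all numbers are made up.
example_lines = [
    Line("A", 400.0, 500.0),
    Line("B", 300.0, 450.0),
    Line("C", 200.0, 400.0),
]
example_lodf = {
    ("A", "B"): 0.6, ("A", "C"): 0.4,
    ("B", "A"): 0.5, ("B", "C"): 0.5,
    ("C", "A"): 0.5, ("C", "B"): 0.5,
}

print(n_minus_1_violations(example_lines, example_lodf))
# -> {'A': ['B'], 'B': ['A']}
```

In this invented case the system is not N-1 secure: losing line A overloads B, and losing B overloads A. An operator would have to redispatch or otherwise adjust the system until the function returns an empty result.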

Further, when a contingency does occur, the operators are required to
identify and assess immediately the new worst contingencies, given the
changed conditions, and promptly make any adjustments needed to ensure
that if one of them were to occur, the system would still remain
operational and safe. NERC operating policy requires that the system be
restored as soon as practical but within no more than 30 minutes to
compliance with normal limits, and to a condition where it can once
again withstand the next-worst single contingency without violating
thermal, voltage, or stability limits. A few areas of the grid are
operated to withstand the concurrent loss of two or more facilities
(i.e., “N-2”). This may be done, for example, as an added safety measure
to protect a densely populated metropolitan area or when lines share a
common structure and could be affected by a common failure mode, e.g., a
single lightning strike.


Why Don’t More Blackouts Happen?

Given the complexity of the bulk power system and the day-to-day
challenges of operating it, there are a lot of things that could go
wrong, which makes it reasonable to wonder why so few large outages
occur.

Large outages or blackouts are infrequent because responsible system
owners and operators practice “defense in depth,” meaning that they
protect the bulk power system through layers of safety-related practices
and equipment. These include:

1. A range of rigorous planning and operating studies, including
long-term assessments, year-ahead, season-ahead, week-ahead, day-ahead,
hour-ahead, and real-time operational contingency analyses. Planners and
operators use these to evaluate the condition of the system, anticipate
problems ranging from likely to low probability but high consequence,
and develop a good understanding of the limits and rules for safe,
secure operation under such contingencies. If multiple contingencies
occur in a single area, they are likely to be interdependent rather than
random, and should have been anticipated in planning studies.

2. Preparation for the worst case. The operating rule is to always
prepare the system to be safe in the face of the worst single
contingency that could occur relative to current conditions, which means
that the system is also prepared for less adverse contingencies.

3. Quick response capability. Most potential problems first emerge as a
small, local situation. When a small, local problem is handled quickly
and responsibly using NERC operating practices, particularly to return
the system to N-1 readiness within 30 minutes or less, the problem can
usually be resolved and contained before it grows beyond local
proportions.

4. Maintain a surplus of generation and transmission. This provides a
cushion in day-to-day operations, and helps ensure that small problems
don’t become big problems.

5. Have backup capabilities for all critical functions. Most owners and
operators maintain backup capabilities, such as redundant equipment
already on-line (from generation in spinning reserve and transmission
operating margin and limits to computers and other operational control
systems), and keep an inventory of spare parts to be able to handle an
equipment failure.

6. Plan, design, and maintain the system to operate reliably. Reliable
power system operation requires far more than monitoring and controlling
the system in real-time. Thorough planning, design, maintenance, and
analysis are required to ensure that the system can be operated reliably
and within safe limits. Short-term planning addresses day-ahead and
week-ahead operations planning; long-term planning focuses on providing
adequate generation resources and transmission capacity to ensure that
in the future the system will be able to withstand severe contingencies
without experiencing widespread, uncontrolled cascading outages.

A utility that serves retail customers must estimate future loads and,
in some cases, arrange for adequate sources of supplies and plan
adequate transmission and distribution infrastructure. NERC planning
standards identify a range of possible contingencies and set
corresponding expectations for system performance under several
categories of possible events, ranging from everyday “probable” events
to “extreme” events that may involve substantial loss of customer load
and generation in a widespread area. NERC planning standards also
address requirements for voltage support and reactive power, disturbance
monitoring, facility ratings, system modeling and data requirements,
system protection and control, and system restoration.

7. Prepare for emergencies. System operators are required to take the
steps described above to plan and operate a reliable power system, but
emergencies can still occur because of external factors such as severe
weather, operator error, or equipment failures that exceed planning,
design, or operating criteria. For these rare events, the operating
entity is required to have emergency procedures covering a credible
range of emergency scenarios. Operators must be trained to recognize and
take effective action in response to these emergencies. To deal with a
system emergency that results in a blackout, such as the one that
occurred on August 14, 2003, there must be procedures and capabilities
to use “black start” generators (capable of restarting with no external
power source) and to coordinate operations in order to restore the
system as quickly as possible to a normal and reliable condition.

Reliability Organizations Oversee 
Grid Reliability in North America 

NERC is a non-governmental entity whose mission is to ensure that the
bulk electric system in North America is reliable, adequate and secure.
The organization was established in 1968, as a result of the Northeast
blackout in 1965. Since its inception, NERC has operated as a voluntary
organization, relying on reciprocity, peer pressure and the mutual
self-interest of all those involved to ensure compliance with
reliability requirements. An independent board governs NERC.

To fulfill its mission, NERC: 

♦ Sets standards for the reliable operation and 
planning of the bulk electric system. 

♦ Monitors and assesses compliance with standards for bulk electric
system reliability.

♦ Provides education and training resources to 
promote bulk electric system reliability. 

♦ Assesses, analyzes and reports on bulk electric 
system adequacy and performance. 

♦ Coordinates with regional reliability councils 
and other organizations. 

♦ Coordinates the provision of applications 
(tools), data and services necessary to support 
the reliable operation and planning of the bulk 
electric system. 

♦ Certifies reliability service organizations and 
personnel. 

♦ Coordinates critical infrastructure protection of 
the bulk electric system. 

♦ Enables the reliable operation of the interconnected bulk electric
system by facilitating information exchange and coordination among
reliability service organizations.

Recent changes in the electricity industry have altered many of the
traditional mechanisms, incentives and responsibilities of the entities
involved in ensuring reliability, to the point that the voluntary system
of compliance with reliability standards is generally recognized as not
adequate to current needs.3 NERC and many other electricity
organizations support the development of a new mandatory system of
reliability standards and compliance, backstopped in the United States
by the Federal Energy Regulatory Commission. This will require federal
legislation in the United States to provide for the creation of a new
electric reliability organization with the statutory authority to
enforce compliance with reliability standards among all market
participants. Appropriate government entities in Canada and Mexico are
prepared to take similar action, and some have already done so. In the
meantime, NERC encourages compliance with its reliability standards
through an agreement with its members.

NERC’s members are ten regional reliability councils. (See Figure 2.5
for a map showing the locations and boundaries of the regional
councils.) In turn, the regional councils have broadened their
membership to include all segments of the electric industry:
investor-owned utilities; federal power agencies; rural electric
cooperatives; state, municipal and provincial utilities; independent
power producers; power marketers; and end-use customers. Collectively,
the members of the NERC regions account for virtually all the
electricity supplied in the United States, Canada, and a portion of Baja
California Norte, Mexico. The ten regional councils jointly fund NERC
and adapt NERC standards to meet the needs of their regions. The August
14 blackout affected three NERC regional reliability councils: East
Central Area Reliability Coordination Agreement (ECAR), Mid-Atlantic
Area Council (MAAC), and Northeast Power Coordinating Council (NPCC).

“Control areas” are the primary operational entities that are subject to
NERC and regional council standards for reliability. A control area is a
geographic area within which a single entity, Independent System
Operator (ISO), or Regional Transmission Organization (RTO) balances
generation and loads in real time to maintain reliable operation.
Control areas are linked with each other through transmission
interconnection tie lines. Control area operators control generation
directly to maintain their electricity interchange schedules with other
control areas. They also operate collectively to support the reliability
of their interconnection. As shown in Figure 2.6, there are
approximately 140 control areas in North America. The control area
dispatch centers have sophisticated monitoring and control systems and
are staffed 24 hours per day, 365 days per year.

Traditionally, control areas were defined by utility service area
boundaries and operations were largely managed by vertically integrated
utilities that owned and operated generation, transmission, and
distribution. While that is still true in some areas, there has been
significant restructuring of operating functions and some consolidation
of control areas into regional operating entities. Utility industry
restructuring has led to an unbundling of generation, transmission and
distribution activities such that the ownership and operation of these
assets have been separated either functionally or through the formation
of independent entities called Independent System Operators (ISOs) and
Regional Transmission Organizations (RTOs).

♦ ISOs and RTOs in the United States have been 
authorized by FERC to implement aspects of the 
Energy Policy Act of 1992 and subsequent FERC 
policy directives. 

♦ The primary functions of ISOs and RTOs are to 
manage in real time and on a day-ahead basis 
the reliability of the bulk power system and the 
operation of wholesale electricity markets 
within their footprint. 

♦ ISOs and RTOs do not own transmission assets; 
they operate or direct the operation of assets 
owned by their members. 

♦ ISOs and RTOs may be control areas themselves, or they may encompass
more than one control area.

♦ ISOs and RTOs may also be NERC Reliability 
Coordinators, as described below. 

Five RTOs/ISOs are within the area directly affected by the August 14
blackout. They are:

♦ Midwest Independent System Operator (MISO)

♦ PJM Interconnection (PJM)

♦ New York Independent System Operator (NYISO)

♦ New England Independent System Operator (ISO-NE)

♦ Ontario Independent Market Operator (IMO)


Figure 2.5. NERC Regions

Reliability coordinators provide reliability oversight over a wide
region. They prepare reliability assessments, provide a wide-area view
of reliability, and coordinate emergency operations in real time for one
or more control areas. They may operate, but do not participate in,
wholesale or retail market functions. There are currently 18 reliability
coordinators in North America. Figure 2.7 shows the locations and
boundaries of their respective areas.

Key Parties in the Pre-Cascade 
Phase of the August 14 Blackout 

The initiating events of the blackout involved two control areas,
FirstEnergy (FE) and American Electric Power (AEP), and their respective
reliability coordinators, MISO and PJM (see Figures 2.7 and 2.8). These
organizations and their reliability responsibilities are described
briefly in this final subsection.

1. FirstEnergy operates a control area in northern Ohio. FirstEnergy
(FE) consists of seven electric utility operating companies. Four of
these companies, Ohio Edison, Toledo Edison, The Illuminating Company,
and Penn Power, operate in the NERC ECAR region, with MISO serving as
their reliability coordinator. These four companies now operate as one
integrated control area managed by FE.4

2. American Electric Power (AEP) operates a control area in Ohio just
south of FE. AEP is both a transmission operator and a control area
operator.

3. Midwest Independent System Operator (MISO) is the reliability
coordinator for FirstEnergy. The Midwest Independent System Operator
(MISO) is the reliability coordinator for a region of more than 1
million square miles (2.6 million square kilometers), stretching from
Manitoba, Canada in the north to Kentucky in the south, from Montana in
the west to western Pennsylvania in the east. Reliability coordination
is provided by two offices, one in Minnesota, and the other at the MISO
headquarters in Indiana. Overall, MISO provides reliability coordination
for 37 control areas, most of which are members of MISO.

4. PJM is AEP’s reliability coordinator. PJM is one of the original ISOs
formed after FERC orders 888 and 889, but was established as a regional
power pool in 1935. PJM recently expanded its footprint to include
control areas and transmission operators within MAIN and ECAR
(PJM-West). It performs its duties as a reliability coordinator in
different ways, depending on the control areas involved. For PJM-East,
it is both the control area and reliability coordinator for ten
utilities, whose transmission systems span the Mid-Atlantic region of
New Jersey, most of Pennsylvania, Delaware, Maryland, West Virginia,
Ohio, Virginia, and the District of Columbia. The PJM-West facility has
the reliability coordinator desk for five control areas (AEP,
Commonwealth Edison, Duquesne Light, Dayton Power and Light, and Ohio
Valley Electric Cooperative) and three generation-only control areas
(Duke Energy’s Washington County (Ohio) facility, Duke’s Lawrence
County/Hanging Rock (Ohio) facility, and Allegheny Energy’s Buchanan
(West Virginia) facility).


Figure 2.6. NERC Regions and Control Areas

Figure 2.7. NERC Reliability Coordinators


Reliability Responsibilities of Control
Area Operators and Reliability
Coordinators

1. Control area operators have primary responsibility for reliability.
Their most important responsibilities, in the context of this report,
are:

N-1 criterion. NERC Operating Policy 2.A, Transmission Operations:

“All CONTROL AREAS shall operate so that instability, uncontrolled
separation, or cascading outages will not occur as a result of the most
severe single contingency.”

Emergency preparedness and emergency response. NERC Operating Policy 5,
Emergency Operations, General Criteria:

“Each system and CONTROL AREA shall promptly take appropriate action to
relieve any abnormal conditions, which jeopardize reliable
Interconnection operation.”

“Each system, CONTROL AREA, and Region shall establish a program of
manual and automatic load shedding which is designed to arrest frequency
or voltage decays that could result in an uncontrolled failure of
components of the interconnection.”

NERC Operating Policy 5.A, Coordination with Other Systems:

“A system, CONTROL AREA, or pool that is experiencing or anticipating an
operating emergency shall communicate its current and future status to
neighboring systems, CONTROL AREAS, or pools and throughout the
interconnection. ... A system shall inform other systems ... whenever
... the system’s condition is burdening other systems or reducing the
reliability of the Interconnection ... [or whenever] the system’s line
loadings and voltage/reactive levels are such that a single contingency
could threaten the reliability of the Interconnection.”


Figure 2.8. Reliability Coordinators and Control
Areas in Ohio and Surrounding States


NERC Operating Policy 5.C, Transmission System Relief:

“Action to correct an OPERATING SECURITY LIMIT violation shall not
impose unacceptable stress on internal generation or transmission
equipment, reduce system reliability beyond acceptable limits, or unduly
impose voltage or reactive burdens on neighboring systems. If all other
means fail, corrective action may require load reduction.”


Operating personnel and training: NERC Operating Policy 8.B, Training:

“Each Operating Authority should periodically practice simulated
emergencies. The scenarios included in practice situations should
represent a variety of operating conditions and emergencies.”

2. Reliability Coordinators such as MISO and PJM are expected to comply
with all aspects of NERC Operating Policies, especially Policy 9,
Reliability Coordinator Procedures, and its appendices. Key requirements
include:

NERC Operating Policy 9, Criteria for Reliability Coordinators, 5.2:

Have “detailed monitoring capability of the RELIABILITY AREA and
sufficient monitoring capability of the surrounding RELIABILITY AREAS to
ensure potential security violations are identified.”


Institutional Complexities and Reliability in the Midwest

The institutional arrangements for reliability in the Midwest are much
more complex than they are in the Northeast, i.e., the areas covered by
the Northeast Power Coordinating Council (NPCC) and the Mid-Atlantic
Area Council (MAAC). There are two principal reasons for this
complexity. One is that in NPCC and MAAC, the independent system
operator (ISO) also serves as the single control area operator for the
individual member systems. In comparison, MISO provides reliability
coordination for 35 control areas in the ECAR, MAIN, and MAPP regions
and 2 others in the SPP region, and PJM provides reliability
coordination for 8 control areas in the ECAR and MAIN regions (plus one
in MAAC). (See table below.) This results in 18
control-area-to-control-area interfaces across the PJM/MISO reliability
coordinator boundary.

The other is that MISO has less reliability-related authority over its
control area members than PJM has over its members. Arguably, this lack
of authority makes day-to-day reliability operations more challenging.
Note, however, that (1) FERC’s authority to require that MISO have
greater authority over its members is limited; and (2) before approving
MISO, FERC asked NERC for a formal assessment of whether reliability
could be maintained under the arrangements proposed by MISO and PJM.
After reviewing proposed plans for reliability coordination within and
between PJM and MISO, NERC replied affirmatively but provisionally. FERC
approved the new MISO-PJM configuration based on NERC’s assessment. NERC
conducted audits in November and December 2002 of the MISO and PJM
reliability plans, and some of the recommendations of the audit teams
are still being addressed. The adequacy of the plans and whether the
plans were being implemented as written are factors in NERC’s ongoing
investigation.

Reliability Coordinator (RC)         Control Areas  Regional Reliability Councils            Control Areas of Interest
                                     in RC Area     Affected and Number of Control Areas     in RC Area
-----------------------------------  -------------  ---------------------------------------  -------------------------------------------------
MISO                                 37             ECAR (12), MAIN (9), MAPP (14), SPP (2)  FE, Cinergy, Michigan Electric Coordinated System
PJM                                  9              MAAC (1), ECAR (7), MAIN (1)             PJM, AEP, Dayton Power & Light
ISO New England                      2              NPCC (2)                                 ISONE, Maritime Provinces
New York ISO                         1              NPCC (1)                                 NYISO
Ontario Independent Market Operator  1              NPCC (1)                                 IMO
Trans-Energie                        1              NPCC (1)                                 Hydro Quebec
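
As a reading aid for the reliability coordinator table in the sidebar “Institutional Complexities and Reliability in the Midwest,” the sketch below holds the table's figures in a small data structure and checks that each coordinator's per-council counts add up to its stated total. The numbers are copied from the table; the dictionary layout and the function name `inconsistent_rows` are assumptions made only for this illustration.

```python
# Figures copied from the sidebar table; the structure itself is illustrative.
RELIABILITY_COORDINATORS = {
    "MISO": {
        "control_areas": 37,
        "by_council": {"ECAR": 12, "MAIN": 9, "MAPP": 14, "SPP": 2},
    },
    "PJM": {
        "control_areas": 9,
        "by_council": {"MAAC": 1, "ECAR": 7, "MAIN": 1},
    },
    "ISO New England": {"control_areas": 2, "by_council": {"NPCC": 2}},
    "New York ISO": {"control_areas": 1, "by_council": {"NPCC": 1}},
    "Ontario Independent Market Operator": {
        "control_areas": 1,
        "by_council": {"NPCC": 1},
    },
    "Trans-Energie": {"control_areas": 1, "by_council": {"NPCC": 1}},
}


def inconsistent_rows(table):
    """Return the coordinators whose per-council counts do not sum to the total."""
    return [
        name
        for name, row in table.items()
        if sum(row["by_council"].values()) != row["control_areas"]
    ]


print(inconsistent_rows(RELIABILITY_COORDINATORS))
# -> []
```

The empty result agrees with the sidebar text: 35 MISO control areas in ECAR, MAIN, and MAPP plus 2 in SPP give 37, and 8 PJM control areas in ECAR and MAIN plus one in MAAC give 9.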

NERC Operating Policy 9, Functions of Reliability Coordinators, 1.7:

“Monitor the parameters that may have significant impacts within the
RELIABILITY AREA and with neighboring RELIABILITY AREAS with respect to
... sharing with other RELIABILITY COORDINATORS any information
regarding potential, expected, or actual critical operating conditions
that could negatively impact other RELIABILITY AREAS. The RELIABILITY
COORDINATOR will coordinate with other RELIABILITY COORDINATORS and
CONTROL AREAS as needed to develop appropriate plans to mitigate
negative impacts of potential, expected, or actual critical operating
conditions ...”


What Constitutes an Operating Emergency? 

An operating emergency is an unsustainable condition that cannot be
resolved using the resources normally available. The NERC Operating
Manual defines a “capacity emergency” as when a system’s or pool’s
operating generation capacity, plus firm purchases from other systems,
to the extent available or limited by transfer capability, is inadequate
to meet its demand plus its regulating requirements. It defines an
“energy emergency” as when a load-serving entity has exhausted all other
options and can no longer provide its customers’ expected energy
requirements. A transmission emergency exists when “the system’s line
loadings and voltage/reactive levels are such that a single contingency
could threaten the reliability of the Interconnection.” Control room
operators and dispatchers are given substantial latitude to determine
when to declare an emergency. (See pages 66-67 in Chapter 5 for more
detail.)
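
The “capacity emergency” definition quoted in the sidebar above amounts to a simple inequality, sketched below. This is an illustration, not a NERC formula: the function and parameter names are hypothetical, and the phrase “to the extent available or limited by transfer capability” is interpreted here, as an assumption, to mean that firm purchases count only up to the available transfer capability.

```python
def is_capacity_emergency(
    operating_capacity_mw: float,
    firm_purchases_mw: float,
    transfer_capability_mw: float,
    demand_mw: float,
    regulating_requirement_mw: float,
) -> bool:
    """Return True when available supply cannot cover demand plus regulation.

    Assumption: firm purchases are usable only up to the transfer
    capability of the ties that would deliver them.
    """
    usable_purchases_mw = min(firm_purchases_mw, transfer_capability_mw)
    available_mw = operating_capacity_mw + usable_purchases_mw
    required_mw = demand_mw + regulating_requirement_mw
    return available_mw < required_mw


# Made-up numbers: 1,500 MW is purchased but only 1,000 MW can be imported.
print(is_capacity_emergency(9000, 1500, 1000, 9800, 300))  # -> True
# Same system with enough transfer capability for the full purchase.
print(is_capacity_emergency(9000, 1500, 1500, 9800, 300))  # -> False
```

The two calls differ only in transfer capability, which shows how a transmission constraint alone can turn an adequate supply position into a capacity emergency under this reading of the definition.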


NERC Operating Policy 9, Functions of Reliability Coordinators, 6:

“Conduct security assessment and monitoring programs to assess
contingency situations. Assessments shall be made in real time and for
the operations planning horizon at the CONTROL AREA level with any
identified problems reported to the RELIABILITY COORDINATOR. The
RELIABILITY COORDINATOR is to ensure that CONTROL AREA, RELIABILITY
AREA, and regional boundaries are sufficiently modeled to capture any
problems crossing such boundaries.”

Endnotes 

1 The province of Quebec, although considered a part of the Eastern
Interconnection, is connected to the rest of the Eastern Interconnection
only by DC ties. In this instance, the DC ties acted as buffers between
portions of the Eastern Interconnection; transient disturbances
propagate through them less readily. Therefore, the electricity system
in Quebec was not affected by the outage, except for a small portion of
the province’s load that is directly connected to Ontario by AC
transmission lines. (Although DC ties can act as a buffer between
systems, the tradeoff is that they do not allow instantaneous generation
support following the unanticipated loss of a generating unit.)

2 In some locations, bulk power flows are controlled through specialized
devices or systems, such as phase angle regulators, “flexible AC
transmission systems” (FACTS), and high-voltage DC converters (and
reconverters) spliced into the AC system. These devices are still too
expensive for general application.

3 See, for example, Maintaining Reliability in a Competitive Electric
Industry (1998), a report to the U.S. Secretary of Energy by the Task
Force on Electric Systems Reliability; National Energy Policy (2001), a
report to the President of the United States by the National Energy
Policy Development Group, p. 7-6; and National Transmission Grid Study
(2002), U.S. Dept. of Energy, pp. 46-48.

4 The remaining three FE companies, Penelec, Met-Ed, and Jersey Central
Power & Light, are in the NERC MAAC region and have PJM as their
reliability coordinator. The focus of this report is on the portion of
FE in the ECAR reliability region and within the MISO reliability
coordinator footprint.

3. Causes of the Blackout
and Violations of NERC Standards


Summary 

This chapter explains in summary form the causes of the initiation of
the blackout in Ohio, based on the analyses by the bi-national
investigation team. It also lists NERC’s findings to date concerning
seven specific violations of its reliability policies, guidelines, and
standards. Last, it explains how some NERC standards and processes were
inadequate because they did not give sufficiently clear direction to
industry members concerning some preventive measures needed to maintain
reliability, and notes that NERC does not have the authority to enforce
compliance with the standards. Clear standards with mandatory
compliance, as contemplated under legislation pending in the U.S.
Congress, might have averted the start of this blackout.

Chapters 4 and 5 provide the details that support the conclusions
summarized here, by describing conditions and events during the days
before and the day of the blackout, and explain how those events and
conditions did or did not cause or contribute to the initiation of the
blackout. Chapter 6 addresses the cascade as the blackout spread beyond
Ohio and reviews the causes and events of the cascade as distinct from
the earlier events in Ohio.

The Causes of the Blackout in Ohio 

A dictionary definition of “cause” is “something that produces an
effect, result, or consequence.”1 In searching for the causes of the
blackout, the investigation team looked back through the progression of
sequential events, actions and inactions to identify the cause(s) of
each event. The idea of “cause” is here linked not just to what happened
or why it happened, but more specifically to the entities whose duties
and responsibilities were to anticipate and prepare to deal with the
things that could go wrong. Four major causes, or groups of causes, are
identified (see box on page 18).

Although the causes discussed below produced the failures and events of
August 14, they did not leap into being that day. Instead, as the
following chapters explain, they reflect long-standing institutional
failures and weaknesses that need to be understood and corrected in
order to maintain reliability.

Linking Causes 
to Specific Weaknesses 

Seven violations of NERC standards, as identified by NERC,2 and other
conclusions reached by NERC and the bi-national investigation team are
aligned below with the specific causes of the blackout. There is an
additional category of conclusions beyond the four principal causes: the
failure to act, when it was the result of preceding conditions. For
instance, FE did not respond to the loss of its transmission lines
because it did not have sufficient information or insight to reveal the
need for action. Note: NERC’s list of violations has been revised and
extended since publication of the Interim Report. Two violations
(numbers 4 and 6, as cited in the Interim Report) were dropped, and
three new violations have been identified in this report (5, 6, and 7,
as numbered here). NERC continues to study the record and may identify
additional violations.3

Group 1: FirstEnergy and ECAR failed to assess and understand the
inadequacies of FE’s system, particularly with respect to voltage
instability and the vulnerability of the Cleveland-Akron area, and FE
did not operate its system with appropriate voltage criteria and
remedial measures.

♦ FE did not monitor and manage reactive reserves for various
contingency conditions as required by NERC Policy 2, Section B,
Requirement 2.

♦ NERC Policy 2, Section A, requires a 30-minute period of time to
re-adjust the system to prepare to withstand the next contingency.




Causes of the Blackout’s Initiation 

The Ohio phase of the August 14, 2003, blackout was caused by
deficiencies in specific practices, equipment, and human decisions by
various organizations that affected conditions and outcomes that
afternoon. For example, insufficient reactive power was an issue in the
blackout, but it was not a cause in itself. Rather, deficiencies in
corporate policies, lack of adherence to industry policies, and
inadequate management of reactive power and voltage caused the blackout,
not the lack of reactive power itself. There are four groups of causes
for the blackout:

Group 1: FirstEnergy and ECAR failed to assess and understand the
inadequacies of FE’s system, particularly with respect to voltage
instability and the vulnerability of the Cleveland-Akron area, and FE
did not operate its system with appropriate voltage criteria. (Note:
This cause was not identified in the Task Force’s Interim Report. It is
based on analysis completed by the investigative team after the
publication of the Interim Report.)

As detailed in Chapter 4: 

A) FE failed to conduct rigorous long-term planning studies of its
system, and neglected to conduct appropriate multiple contingency or
extreme condition assessments. (See pages 37-39 and 41-43.)

B) FE did not conduct sufficient voltage analyses for its Ohio control
area and used operational voltage criteria that did not reflect actual
voltage stability conditions and needs. (See pages 31-37.)

C) ECAR (FE’s reliability council) did not conduct an independent review
or analysis of FE’s voltage criteria and operating needs, thereby
allowing FE to use inadequate practices without correction. (See page
39.)

D) Some of NERC’s planning and operational requirements and standards
were sufficiently ambiguous that FE could interpret them to include
practices that were inadequate for reliable system operation. (See pages
31-33.)


Group 2: Inadequate situational awareness at FirstEnergy. FE did not
recognize or understand the deteriorating condition of its system.

As discussed in Chapter 5:

A) FE failed to ensure the security of its transmission system after
significant unforeseen contingencies because it did not use an effective
contingency analysis capability on a routine basis. (See pages 49-50 and
64.)

B) FE lacked procedures to ensure that its operators were continually
aware of the functional state of their critical monitoring tools. (See
pages 51-53, 56.)

C) FE control center computer support staff and operations staff did not
have effective internal communications procedures. (See pages 54, 56,
and 65-67.)

D) FE lacked procedures to test effectively the functional state of its
monitoring tools after repairs were made. (See page 54.)

E) FE did not have additional or back-up monitoring tools to understand
or visualize the status of its transmission system, which would have
helped its operators understand transmission system conditions after the
failure of its primary monitoring/alarming systems. (See pages 53, 56,
and 65.)

Group 3: FE failed to manage adequately tree growth in its transmission
rights-of-way.

This failure was the common cause of the outage of three FE 345-kV
transmission lines and one 138-kV line. (See pages 57-64.)

Group 4: Failure of the interconnected grid’s 
reliability organizations to provide effective 
real-time diagnostic support. 

As discussed in Chapter 5: 

A) MISO did not have real-time data from
Dayton Power and Light’s Stuart-Atlanta
345-kV line incorporated into its state estimator
(a system monitoring tool). This precluded
MISO from becoming aware of FE’s system
problems earlier and providing diagnostic
assistance or direction to FE. (See pages
49-50.)

B) MISO’s reliability coordinators were using
non-real-time data to support real-time
“flowgate” monitoring. This prevented MISO
from detecting an N-1 security violation in
FE’s system and from assisting FE in necessary
relief actions. (See pages 48 and 63.)

C) MISO lacked an effective way to identify the
location and significance of transmission line
breaker operations reported by their Energy
Management System (EMS). Such information
would have enabled MISO operators to
become aware earlier of important line outages.
(See page 48.)


D) PJM and MISO lacked joint procedures or 
guidelines on when and how to coordinate a 
security limit violation observed by one of 
them in the other’s area due to a contingency 
near their common boundary. (See pages 
62-63 and 65-66.) 


In the chapters that follow, sections that relate to 
particular causes are denoted with the following 
symbols: 


Cause 1: Inadequate System Understanding

Cause 2: Inadequate Situational Awareness

Cause 3: Inadequate Tree Trimming

Cause 4: Inadequate RC Diagnostic Support


♦ NERC is lacking a well-defined control area
(CA) audit process that addresses all CA responsibilities.
Control area audits have generally not
been conducted with sufficient regularity and
have not included a comprehensive audit of the
control area’s compliance with all NERC and
Regional Council requirements. Compliance
with audit results is not mandatory.

♦ ECAR did not conduct adequate review or analyses
of FE’s voltage criteria, reactive power
management practices, and operating needs.

♦ FE does not have an adequate automatic undervoltage
load-shedding program in the Cleveland-Akron
area.

Group 2: Inadequate situational awareness 
at FirstEnergy. FE did not recognize or 
understand the deteriorating condition of 
its system. 

Violations (Identified by NERC): 

♦ Violation 7: FE’s operational monitoring equipment
was not adequate to alert FE’s operators
regarding important deviations in operating
conditions and the need for corrective action as
required by NERC Policy 4, Section A, Requirement
5.

♦ Violation 3: FE’s state estimation and contingency
analysis tools were not used to assess
system conditions, violating NERC Operating
Policy 5, Section C, Requirement 3, and Policy
4, Section A, Requirement 5.


Other Problems: 

♦ FE personnel did not ensure that their
Real-Time Contingency Analysis (RTCA) was a
functional and effective EMS application as
required by NERC Policy 2, Section A, Requirement
1.

♦ FE’s operational monitoring equipment was not
adequate to provide a means for its operators to
evaluate the effects of the loss of significant
transmission or generation facilities as required
by NERC Policy 4, Section A, Requirement 4.

♦ FE’s operations personnel were not provided
sufficient operations information and analysis
tools as required by NERC Policy 5, Section C,
Requirement 3.

♦ FE’s operations personnel were not adequately
trained to maintain reliable operation under
emergency conditions as required by NERC Policy
8, Section 1.

♦ NERC Policy 4 has no detailed requirements for:
(a) monitoring and functional testing of critical
EMS and supervisory control and data acquisition
(SCADA) systems, and (b) contingency
analysis.

♦ NERC Policy 6 includes a requirement to plan
for loss of the primary control center, but lacks
specific provisions concerning what must be
addressed in the plan.

♦ NERC system operator certification tests for
basic operational and policy knowledge.
Significant additional training is needed to
qualify an individual to perform system operation
and management functions.

Group 3: FE failed to manage adequately tree 
growth in its transmission rights-of-way. This 
failure was the common cause of the outage of 
three FE 345-kV transmission lines and 
affected several 138-kV lines. 

♦ FE failed to maintain equipment ratings 
through a vegetation management program. A 
vegetation management program is necessary to 
fulfill NERC Policy 2, Section A, Requirement 1 
(Control areas shall develop, maintain, and 
implement formal policies and procedures to 
provide for transmission security . . . including 
equipment ratings.) 

♦ Vegetation management requirements are not 
defined in NERC Standards and Policies. 

Group 4: Failure of the interconnected grid’s 
reliability organizations to provide effective 
diagnostic support. 

Violations (Identified by NERC): 

♦ Violation 4: MISO did not notify other reliability
coordinators of potential system problems as
required by NERC Policy 9, Section C, Requirement
2.

♦ Violation 5: MISO was using non-real-time data
to support real-time operations, in violation of
NERC Policy 9, Appendix D, Section A, Criteria
5.2.

♦ Violation 6: PJM and MISO as reliability coordinators
lacked procedures or guidelines between
their respective organizations regarding the
coordination of actions to address an operating
security limit violation observed by one of them
in the other’s area due to a contingency near
their common boundary, as required by Policy
9, Appendix C. Note: Policy 9 lacks specifics on
what constitutes coordinated procedures and
training.

Other Problems: 

♦ MISO did not have adequate monitoring capability
to fulfill its reliability coordinator responsibilities
as required by NERC Policy 9,
Appendix D, Section A.

♦ Although MISO is the reliability coordinator for
FE, on August 14 FE was not a signatory to the
MISO Transmission Owners Agreement and
was not under the MISO tariff, so MISO did not
have the necessary authority as FE’s Reliability
Coordinator as required by NERC Policy 9, Section
B, Requirement 2.

♦ Although lacking authority under a signed
agreement, MISO as reliability coordinator nevertheless
should have issued directives to FE to
return system operation to a safe and reliable
level as required by NERC Policy 9, Section B,
Requirement 2, before the cascading outages
occurred.

♦ American Electric Power (AEP) and PJM 
attempted to use the transmission loading relief 
(TLR) process to address transmission power 
flows without recognizing that a TLR would not 
solve the problem. 

♦ NERC Policy 9 does not contain a requirement 
for reliability coordinators equivalent to the 
NERC Policy 2 statement that monitoring 
equipment is to be used in a manner that would 
bring to the reliability coordinator’s attention 
any important deviations in operating 
conditions. 

♦ NERC Policy 9 lacks criteria for determining the
critical facilities lists in each reliability coordinator
area.

♦ NERC Policy 9 lacks specifics on coordinated
procedures and training for reliability coordinators
regarding “operating to the most conservative
limit” in situations when operating
conditions are not fully understood.

Failures to act by FirstEnergy or others to solve
the growing problem, due to the other causes.

Violations (Identified by NERC): 

♦ Violation 1: Following the outage of the Chamberlin-Harding
345-kV line, FE operating personnel
did not take the necessary action to
return the system to a safe operating state as
required by NERC Policy 2, Section A, Standard
1.

♦ Violation 2: FE operations personnel did not
adequately communicate its emergency operating
conditions to neighboring systems as
required by NERC Policy 5, Section A.

Other Problems: 

♦ FE operations personnel did not promptly take
action as required by NERC Policy 5, General
Criteria, to relieve the abnormal conditions
resulting from the outage of the Harding-Chamberlin
345-kV line.

♦ FE operations personnel did not implement
measures to return system operation to within
security limits in the prescribed time frame
of NERC Policy 2, Section A, Standard 2, following
the outage of the Harding-Chamberlin
345-kV line.

♦ FE operations personnel did not exercise the
authority to alleviate the operating security
limit violation as required by NERC Policy 5,
Section C, Requirement 2.

♦ FE did not exercise a load reduction program to
relieve the critical system operating conditions
as required by NERC Policy 2, Section A,
Requirement 1.2.

♦ FE did not demonstrate the application of
effective emergency operating procedures as
required by NERC Policy 6, Section B, Emergency
Operations Criteria.

♦ FE operations personnel did not demonstrate
that FE has an effective manual load shedding
program designed to address voltage decays
that result in uncontrolled failure of components
of the interconnection as required by
NERC Policy 5, General Criteria.

♦ NERC Policy 5 lacks specifics for Control Areas
on procedures for coordinating with other systems
and training regarding “operating to the
most conservative limit” in situations when
operating conditions are not fully understood.

Institutional Issues 

As indicated above, the investigation team identified
a number of institutional issues with respect
to NERC’s reliability standards. Many of the institutional
problems arise not because NERC is an
inadequate or ineffective organization, but rather
because it has no structural independence from
the industry it represents and has no authority to
develop strong reliability standards and to enforce
compliance with those standards. While many in
the industry and at NERC support such measures,
legislative action by the U.S. Congress is needed to
make this happen.

These institutional issues can be summed up 
generally: 


1. Although NERC’s provisions address many of
the factors and practices which contributed to
the blackout, some of the policies or guidelines
are inexact, non-specific, or lacking in detail,
allowing divergent interpretations among reliability
councils, control areas, and reliability
coordinators. NERC standards are minimum
requirements that may be made more stringent
if appropriate by regional or subregional bodies,
but the regions have varied in their willingness
to implement exacting reliability standards.

2. NERC and the industry’s reliability community
were aware of the lack of specificity and detail
in some standards, including definitions of
Operating Security Limits, definition of
planned outages, and delegation of Reliability
Coordinator functions to control areas, but they
moved slowly to address these problems
effectively.

3. Some standards relating to the blackout’s
causes lack specificity and measurable compliance
criteria, including those pertaining to
operator training, back-up control facilities,
procedures to operate when part or all of the
EMS fails, emergency procedure training,
system restoration plans, reactive reserve
requirements, line ratings, and vegetation
management.

4. The NERC compliance program and region-based
auditing process has not been comprehensive
or aggressive enough to assess the capability
of all control areas to direct the operation
of their portions of the bulk power system. The
effectiveness and thoroughness of regional
councils’ efforts to audit for compliance with
reliability requirements have varied significantly
from region to region. Equally important,
absent mandatory compliance and penalty
authority, there is no requirement that an entity
found to be deficient in an audit must remedy
the deficiency.

5. NERC standards are frequently administrative 
and technical rather than results-oriented. 

6. A recently-adopted NERC process for development
of standards is lengthy and not yet fully
understood or applied by many industry participants.
Whether this process can be adapted to
support an expedited development of clear and
auditable standards for key topics remains to be
seen.

7. NERC has not had an effective process to ensure
that recommendations made in various reports
and disturbance analyses are tracked for
accountability. On their own initiative, some
regional councils have developed effective
tracking procedures for their geographic areas.

Control areas and reliability coordinators operate
the grid every day under guidelines, policies, and
requirements established by the industry’s reliability
community under NERC’s coordination. If
those policies are strong, clear, and unambiguous,
then everyone will plan and operate the system at
a high level of performance and reliability will be
high. But if those policies are ambiguous and do
not make entities’ roles and responsibilities clear
and certain, they allow companies to perform at
varying levels and system reliability is likely to be
compromised.

Given that NERC has been a voluntary organization
that makes decisions based on member votes,
if NERC’s standards have been unclear, non-specific,
lacking in scope, or insufficiently strict,
that reflects at least as much on the industry community
that drafts and votes on the standards as it
does on NERC. Similarly, NERC’s ability to obtain
compliance with its requirements through its
audit process has been limited by the extent to
which the industry has been willing to support the
audit program.

Endnotes 

1 Webster’s II New Riverside University Dictionary, Riverside 
Publishing Co., 1984. 

2 A NERC team looked at whether and how violations of
NERC’s reliability requirements may have occurred in the
events leading up to the blackout. They also looked at
whether deficiencies in the requirements, practices and procedures
of NERC and the regional reliability organizations
may have contributed to the blackout. They found seven specific
violations of NERC operating policies (although some are
qualified by a lack of specificity in the NERC requirements).

The Standards, Procedures and Compliance Investigation
Team reviewed the NERC Policies for violations, building on
work and going beyond work done by the Root Cause Analysis
Team. Based on that review the Standards team identified
a number of violations related to policies 2, 4, 5, and 9.

Violation 1: Following the outage of the Chamberlin-Harding
345-kV line, FE did not take the necessary actions to
return the system to a safe operating state within 30 minutes.
(While Policy 5 on Emergency Operations does not address
the issue of “operating to the most conservative limit” when
coordinating with other systems and operating conditions are
not understood, other NERC policies do address this matter:
Policy 2, Section A, Standard 1, on basic reliability for single
contingencies; Policy 2, Section A, Standard 2, to return a system
to within operating security limits within 30 minutes;
Policy 2, Section A, Requirement 1, for formal policies and
procedures to provide for transmission security; Policy 5,
General Criteria, to relieve any abnormal conditions that jeopardize
reliable operation; Policy 5, Section C, Requirement 1,
to relieve security limit violations; and Policy 5, Section 2,
Requirement 2, which gives system operators responsibility
and authority to alleviate operating security limit violations
using timely and appropriate actions.)

Violation 2: FE did not notify other systems of an impending
system emergency. (Policy 5, Section A, Requirement 1,
directs a system to inform other systems if it is burdening others,
reducing system reliability, or if its lack of single contingency
coverage could threaten Interconnection reliability.
Policy 5, Section A, Criteria, has similar provisions.)

Violation 3: FE’s state estimation/contingency analysis
tools were not used to assess the system conditions. (This is
addressed in Operating Policy 5, Section C, Requirement 3,
concerning assessment of Operating Security Limit violations,
and Policy 4, Section A, Requirement 5, which
addresses using monitoring equipment to inform the system
operator of important conditions and the potential need for
corrective action.)

Violation 4: MISO did not notify other reliability coordinators
of potential problems. (Policy 9, Section C, Requirement
2, directing the reliability coordinator to alert all control areas
and reliability coordinators of a potential transmission problem.)

Violation 5: MISO was using non-real-time data to support
real-time operations. (Policy 9, Appendix D, Section A, Criteria
For Reliability Coordinators 5.2, regarding adequate facilities
to perform their responsibilities, including detailed
monitoring capability to identify potential security violations.)

Violation 6: PJM and MISO as Reliability Coordinators
lacked procedures or guidelines between themselves on when
and how to coordinate an operating security limit violation
observed by one of them in the other’s area due to a contingency
near their common boundary (Policy 9, Appendix 9C,
Emergency Procedures). Note: Since Policy 9 lacks specifics
on coordinated procedures and training, it was not possible
for the bi-national team to identify the exact violation that
occurred.

Violation 7: The monitoring equipment provided to FE
operators was not sufficient to bring the operators’ attention
to the deviation on the system. (Policy 4, Section A, System
Monitoring Requirements regarding resource availability and
the use of monitoring equipment to alert operators to the need
for corrective action.)

3 NERC has not yet completed its review of planning standards
and violations.
4. Context and Preconditions for the Blackout
The Northeastern Power Grid 
Before the Blackout Began 


Summary 

This chapter reviews the state of the northeast portion
of the Eastern Interconnection during the
days and hours before 16:00 EDT on August 14,
2003, to determine whether grid conditions before
the blackout were in some way unusual and might
have contributed to the initiation of the blackout.
Task Force investigators found that at 15:05 Eastern
Daylight Time, immediately before the tripping
(automatic shutdown) of FirstEnergy’s (FE)
Harding-Chamberlin 345-kV transmission line,
the system was electrically secure and was able to
withstand the occurrence of any one of more than
800 contingencies, including the loss of the Harding-Chamberlin
line. At that time the system was
electrically within prescribed limits and in compliance
with NERC’s operating policies.

Determining that the system was in a reliable 
operational state at 15:05 EDT on August 14, 2003, 
is extremely significant for determining the causes 
of the blackout. It means that none of the electrical 
conditions on the system before 15:05 EDT was a 
direct cause of the blackout. This eliminates a 
number of possible causes of the blackout, 
whether individually or in combination with one 
another, such as: 

♦ Unavailability of individual generators or transmission
lines

♦ High power flows across the region 

♦ Low voltages earlier in the day or on prior days 

♦ System frequency variations 

♦ Low reactive power output from independent 
power producers (IPPs). 

This chapter documents that although the system
was electrically secure, there was clear experience
and evidence that the Cleveland-Akron area was
highly vulnerable to voltage instability problems.
While it was possible to operate the system
securely despite those vulnerabilities, FirstEnergy
was not doing so because the company had not
conducted the long-term and operational planning
studies needed to understand those vulnerabilities
and their operational implications.

Reliability and Security

NERC and this report use the following definitions
for reliability, adequacy, and security.

Reliability: The degree of performance of the
elements of the bulk electric system that results
in electricity being delivered to customers
within accepted standards and in the amount
desired. Reliability may be measured by the frequency,
duration, and magnitude of adverse
effects on the electricity supply.

Adequacy: The ability of the electric system to
supply the aggregate electrical demand and
energy requirements of the customers at all
times, taking into account scheduled and reasonably
expected unscheduled outages of system
elements.

Security: The ability of the electric system to
withstand sudden disturbances such as electric
short circuits or unanticipated loss of system
elements.

It is important to emphasize that establishing
whether conditions were normal or unusual prior
to and on August 14 does not change the responsibilities
and actions expected of the organizations
and operators charged with ensuring power system
reliability. As described in Chapter 2, the electricity
industry has developed and codified a set of
mutually reinforcing reliability standards and
practices to ensure that system operators are
prepared for the unexpected. The basic assumption
underlying these standards and practices
is that power system elements will fail or
become unavailable in unpredictable ways and at
unpredictable times. Sound reliability management
is designed to ensure that operators can continue
to operate the system within appropriate
thermal, voltage, and stability limits following the
unexpected loss of any key element (such as a
major generator or key transmission facility).
These practices have been designed to maintain a
functional and reliable grid, regardless of whether
actual operating conditions are normal.


Geography Lesson

In analyzing the August 14 blackout, it is crucial
to understand the geography of the FirstEnergy
area. FirstEnergy has seven subsidiary distribution
utilities: Toledo Edison, Ohio Edison, and
The Illuminating Company in Ohio and four
more in Pennsylvania and New Jersey. Its Ohio
control area spans the three Ohio distribution
utility footprints and that of Cleveland Public
Power, a municipal utility serving the city of
Cleveland. Within FE’s Ohio control area is the
Cleveland-Akron area, shown in red cross-hatch.

This geographic distinction matters because
the Cleveland-Akron area is a transmission-constrained
load pocket with relatively limited
generation. While some analyses of the blackout
refer to voltages and other indicators measured at
the boundaries of FE’s Ohio control area, those
indicators have limited relevance to the blackout;
the indicators of conditions at the edges of
and within the Cleveland-Akron area are the
ones that matter.

Area                                                      All-Time Peak    Load on August 14,
                                                          Load (MW)        2003 (MW)
Cleveland-Akron Area (including Cleveland Public Power)   7,340            6,715
FirstEnergy Control Area, Ohio                            13,299           12,165
FirstEnergy Retail Area, including PJM                    24,267           22,631

NA = not applicable.

It is a basic principle of reliability management
that “operators must operate the system they have
in front of them,” unconditionally. The system
must be operated at all times to withstand any single
contingency and yet be ready within 30 minutes
for the next contingency. If a facility is lost
unexpectedly, the system operators must determine
whether to make operational changes,
including adjusting generator outputs, curtailing
electricity transactions, taking transmission elements
out of service or restoring them, and if necessary,
shedding interruptible and firm customer
load (i.e., cutting some customers off temporarily,
and in the right locations, to reduce electricity
demand to a level that matches what the
system is then able to deliver safely).
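
The single-contingency rule described above can be illustrated with a minimal sketch. It assumes a linear line outage distribution factor (LODF) model, which this report does not itself describe, and the three identical parallel lines, their flows, and the 700 MW ratings are all hypothetical. The point is only to show how a system can be secure before a line trips and yet be exposed to the next contingency once that line is lost.

```python
from typing import Dict, List, Tuple

Flows = Dict[str, float]


def lodf_parallel(lines: List[str]) -> Dict[Tuple[str, str], float]:
    """LODFs for identical parallel lines (hypothetical topology, >= 2 lines).

    When one line trips, its flow is shared equally by the remaining lines.
    """
    share = 1.0 / (len(lines) - 1)
    return {(kept, lost): share for kept in lines for lost in lines if kept != lost}


def n1_violations(flows: Flows, ratings: Flows,
                  lodf: Dict[Tuple[str, str], float]) -> List[Tuple[str, str, float]]:
    """Return (lost_line, overloaded_line, post_outage_mw) for every single
    line outage that would push another line above its rating."""
    found = []
    for lost in flows:
        for kept in flows:
            if kept == lost:
                continue
            post = flows[kept] + lodf[(kept, lost)] * flows[lost]
            if abs(post) > ratings[kept]:
                found.append((lost, kept, post))
    return found


ratings = {"A": 700.0, "B": 700.0, "C": 700.0}  # hypothetical MW limits

# All three lines in service: losing any one leaves 600 MW on each of the others.
flows_before = {"A": 400.0, "B": 400.0, "C": 400.0}
before = n1_violations(flows_before, ratings, lodf_parallel(["A", "B", "C"]))

# Line A has tripped: B and C carry 600 MW each, still within their ratings,
# but losing either one would put 1,200 MW on the survivor.
flows_after = {"B": 600.0, "C": 600.0}
after = n1_violations(flows_after, ratings, lodf_parallel(["B", "C"]))

print("violations before outage:", before)
print("violations after outage: ", after)
```

In the second case the operators would have to redispatch generation, curtail transactions, or shed load to restore single-contingency security, which is the 30-minute readjustment the paragraph above describes.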

This chapter discusses system conditions in and
around northeast Ohio on August 14 and their relevance
to the blackout. It reviews electric loads
(real and reactive), system topology (transmission
and generation equipment availability and capabilities),
power flows, voltage profiles and reactive
power reserves. The discussion examines actual
system data, investigation team modeling results,
and past FE and AEP experiences in the Cleveland-Akron
area. The detailed analyses will be
presented in a NERC technical report.

Electric Demands on August 14 

Temperatures on August 14 were hot but in a normal
range throughout the northeast region of the
United States and in eastern Canada (Figure 4.1).
Electricity demands were high due to high air conditioning
loads typical of warm days in August,
though not unusually so. As the temperature
increased from 78°F (26°C) on August 11 to
87°F (31°C) on August 14, peak load within
FirstEnergy’s control area increased by 20%, from
10,095 MW to 12,165 MW. System operators had
successfully managed higher demands in northeast
Ohio and across the Midwest, both earlier in
the summer and in previous years; historic peak
load for FE’s control area was 13,299 MW. August
14 was FE’s peak demand day in 2003.
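
As a quick check on the figures in the paragraph above, the sketch below recomputes the temperature conversions, the load growth from August 11 to August 14, and the August 14 peak as a share of the historic peak. The function and variable names are illustrative only; the numbers are the ones quoted in the text.

```python
def f_to_c(deg_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0


def percent_increase(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


peak_aug11_mw = 10_095      # FE control area peak load, August 11
peak_aug14_mw = 12_165      # FE control area peak load, August 14
historic_peak_mw = 13_299   # FE control area all-time peak load

growth = percent_increase(peak_aug11_mw, peak_aug14_mw)
share = peak_aug14_mw / historic_peak_mw * 100.0

print(f"78°F = {f_to_c(78):.1f}°C, 87°F = {f_to_c(87):.1f}°C")
print(f"Load growth August 11 to 14: {growth:.1f}%")
print(f"August 14 peak as share of historic peak: {share:.1f}%")
```

The growth works out to about 20.5%, consistent with the rounded 20% in the text, and the August 14 peak was roughly 91.5% of the all-time peak.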

Several large operators in the Midwest consistently
under-forecasted load levels between
August 11 and 14. Figure 4.2 shows forecast and
actual power demands for AEP, Michigan Electrical
Coordinated Systems (MECS), and FE from
August 11 through August 14. Variances between
actual and forecast loads are not unusual, but
because those forecasts are used for day-ahead
planning for generation, purchases, and reactive
power management, they can affect equipment
availability and schedules for the following day.

The existence of high air conditioning loads across
the Midwest on August 14 is relevant because air
conditioning loads (like other induction motors)
have lower power factors than other customer
electricity uses, and consume more reactive
power. Because it had been hot for several days in
the Cleveland-Akron area, more air conditioners
were running to overcome the persistent heat, and
consuming relatively high levels of reactive
power, further straining the area’s limited reactive
generation capabilities.

Generation Facilities Unavailable 
on August 14 

Several key generators in the region were out of 
service going into the day of August 14. On any 
given day, some generation and transmission 
capacity is unavailable; some facilities are out for 
routine maintenance, and others have been forced 
out by an unanticipated breakdown and require 
repairs. August 14, 2003, in northeast Ohio was no 
exception (Table 4.1). 

The generating units that were not available on
August 14 provide real and reactive power directly
to the Cleveland, Toledo, and Detroit areas. Under
standard practice, system operators take into
account the unavailability of such units and any
transmission facilities known to be out of service
in the day-ahead planning studies they perform to
ensure a secure system for the next day. Knowing
the status of key facilities also helps operators
determine in advance the safe electricity transfer
levels for the coming day.

Figure 4.1. August 2003 Temperatures in the U.S.
Northeast and Eastern Canada

Figure 4.2. Load Forecasts Below Actuals,
August 11 through 14

MISO’s day-ahead planning studies for August 14
took the above generator outages and transmission
outages reported to MISO into account and
determined that the regional system could be
operated safely. The unavailability of these generation
units did not cause the blackout.

On August 14 four or five capacitor banks within 
the Cleveland-Akron area had been removed from 
service for routine inspection, including capacitor 
banks at Fox and Avon 138-kV substations. 1 
These static reactive power sources are important 
for voltage support, but were not restored to 


Table 4.1. Generators Not Available on August 14

Generator                   Rating      Reason
Davis-Besse Nuclear Unit    883 MW      Prolonged NRC-ordered outage beginning on 3/22/02
Sammis Unit 3               180 MW      Forced outage on 8/12/03
Eastlake Unit 4             238 MW      Forced outage on 8/13/03
Monroe Unit 1               817 MW      Planned outage, taken out of service on 8/8/03
Cook Nuclear Unit 2         1,060 MW    Outage began on 8/13/03


Load Power Factors and Reactive Power 

Load power factor is a measure of the relative
magnitudes of real power and reactive power
consumed by the load connected to a power system.
Resistive load, such as electric space heaters
or incandescent lights, consumes only real
power and no reactive power and has a load
power factor of 1.0. Induction motors, which are
widely used in manufacturing processes, mining,
and homes (e.g., air-conditioners, fan motors
in forced-air furnaces, and washing machines)
consume both real power and reactive power.
Their load power factors are typically in the
range of 0.7 to 0.9 during steady-state operation.
Single-phase small induction motors (e.g.,
household items) generally have load power factors
in the lower range.

The lower the load power factor, the more reactive
power is consumed by the load. For example,
a 100 MW load with a load power factor of
0.92 consumes 43 MVAr of reactive power, while
the same 100 MW of load with a load power factor
of 0.88 consumes 54 MVAr of reactive power.
Under depressed voltage conditions, the induction
motors used in air-conditioning units and
refrigerators, which are used more heavily on hot
and humid days, draw even more reactive power
than under normal voltage conditions.
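
The 43 MVAr and 54 MVAr figures follow from the standard power-triangle relationship Q = P tan(arccos(pf)). The short sketch below reproduces them; the function name is illustrative and not taken from the report.

```python
import math


def reactive_power_mvar(real_power_mw: float, power_factor: float) -> float:
    """Reactive power drawn by a load, from Q = P * tan(arccos(pf))."""
    if not 0.0 < power_factor <= 1.0:
        raise ValueError("power factor must be in (0, 1]")
    return real_power_mw * math.tan(math.acos(power_factor))


for pf in (1.0, 0.92, 0.88):
    q = reactive_power_mvar(100.0, pf)
    print(f"100 MW at power factor {pf:.2f} draws {q:.1f} MVAr")
```

A purely resistive load (power factor 1.0) draws no reactive power, while dropping the power factor from 0.92 to 0.88 raises the reactive demand of the same 100 MW load by about a quarter.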

In addition to end-user loads, transmission elements
such as transformers and transmission
lines consume reactive power. Reactive power
compensation is required at various locations in
the network to support the transmission of real
power. Reactive power is consumed within
transmission lines in proportion to the square of
the electric current shipped, so a 10% increase of
power transfer will require a 21% increase in
reactive power generation to support the power
transfer.
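
The 21% figure is the square-law relationship at work: if line current rises in proportion to power transfer (an assumption that holds when voltage stays roughly constant), reactive consumption in the line scales with the square of the transfer. A minimal sketch, with an illustrative function name:

```python
def reactive_increase_pct(transfer_increase_pct: float) -> float:
    """Percent increase in line reactive consumption (proportional to I^2)
    for a given percent increase in power transfer, assuming constant
    voltage so that current scales linearly with transfer."""
    scale = 1.0 + transfer_increase_pct / 100.0
    return (scale ** 2 - 1.0) * 100.0


for pct in (5.0, 10.0, 20.0):
    print(f"{pct:.0f}% more transfer -> {reactive_increase_pct(pct):.1f}% more reactive power")
```

Under depressed voltages the current needed for the same transfer is higher, so the real increase can exceed what this constant-voltage sketch shows.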

In metropolitan areas with summer peaking loads, it is generally
recognized that as temperatures and humidity increase, load
demand increases significantly. The power factor impact can be
quite large. For example, for a metropolitan area of 5 million
people, the shift from winter peak to summer peak demand can
raise peak load from 9,200 MW in winter to 10,000 MW in summer.
That change to summer electric loads can shift the load power
factor from 0.92 in winter down to 0.88 in summer, which will
increase the MVAr load demand from 3,950 in winter up to 5,400
in summer, all due to the changed composition of end uses and
the load power factor influences noted above.
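The reactive power figures quoted above follow from the standard
relation Q = P * tan(arccos(pf)) for a lagging load, and the 21%
figure from the square-law dependence of line reactive losses on
current. The sketch below is only an illustration of that
arithmetic; the function names are hypothetical and are not taken
from the report.

```python
import math


def reactive_demand_mvar(real_power_mw: float, power_factor: float) -> float:
    """Reactive power (MVAr) drawn by a lagging load.

    Uses Q = P * tan(arccos(pf)); the function name is illustrative only.
    """
    if not 0.0 < power_factor <= 1.0:
        raise ValueError("power factor must be in (0, 1]")
    return real_power_mw * math.tan(math.acos(power_factor))


def line_reactive_loss_increase(transfer_increase: float) -> float:
    """Fractional rise in line reactive losses for a fractional rise in
    current, assuming losses scale with the square of the current."""
    return (1.0 + transfer_increase) ** 2 - 1.0


# The 100 MW examples from the text.
print(round(reactive_demand_mvar(100, 0.92)))  # 43
print(round(reactive_demand_mvar(100, 0.88)))  # 54

# The metropolitan-area example (the text rounds these figures).
print(reactive_demand_mvar(9_200, 0.92))
print(reactive_demand_mvar(10_000, 0.88))

# A 10% increase in transfer.
print(line_reactive_loss_increase(0.10))
```

The computed winter and summer values land within about 1% of the
rounded 3,950 and 5,400 MVAr figures given in the text.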

Reactive power does not travel far, especially 
under heavy load conditions, and so must be 
generated close to its point of consumption. This 
is why urban load centers with summer peaking 
loads are generally more susceptible to voltage 
instability than those with winter peaking loads. 
Thus, control areas must continually monitor 
and evaluate system conditions, examining reac¬ 
tive reserves and voltages, and adjust the system 
as necessary for secure operation.

service that afternoon despite the system operators’ need for
more reactive power in the area. 2 Normal utility practice is to
inspect and maintain reactive resources in off-peak seasons so
the facilities will be fully available to meet peak loads.


Cause 1: Inadequate System Understanding

The unavailability of the critical reactive resources was not
known to those outside of FirstEnergy. NERC policy requires that
critical facilities be identified and that neighboring control
areas and reliability coordinators be made aware of the status
of those facilities to identify the impact of those conditions
on their own facilities. However, FE never identified these
capacitor banks as critical and so did not pass on status
information to others.

Recommendations 23, page 160; 30, page 163


Unanticipated Outages of 
Transmission and Generation 
on August 14 

Three notable unplanned outages occurred in 
Ohio and Indiana on August 14 before 15:05 EDT. 
Around noon, several Cinergy transmission lines 
in south-central Indiana tripped; at 13:31 EDT, 
FE’s Eastlake 5 generating unit along the south¬ 
western shore of Lake Erie tripped; at 14:02 EDT, a 
line within the Dayton Power and Light (DPL) con¬ 
trol area, the Stuart-Atlanta 345-kV line in south¬ 
ern Ohio, tripped. Only the Eastlake 5 trip was 
electrically significant to the FirstEnergy system. 

♦ Transmission lines on the Cinergy 345-, 230-, 
and 138-kV systems experienced a series of out¬ 
ages starting at 12:08 EDT and remained out of 
service during the entire blackout. The loss of 
these lines caused significant voltage and load¬ 
ing problems in the Cinergy area. Cinergy made 
generation changes, and MISO operators 
responded by implementing transmission load¬ 
ing relief (TLR) procedures to control flows on 
the transmission system in south-central Indi¬ 
ana. System modeling by the investigation team 
(see details below, pages 41-43) showed that the 
loss of these lines was not electrically related to 
subsequent events in northern Ohio that led to 
the blackout. 

♦ The Stuart-Atlanta 345-kV line, operated by DPL, and monitored
by the PJM reliability coordinator, tripped at 14:02 EDT. This
was the result of a tree contact, and the line remained out of
service the entire afternoon. As explained below, system
modeling by the investigation team has shown that this outage
did not cause the subsequent events in northern Ohio that led to
the blackout. However, since the line was not in MISO’s
footprint, MISO operators did not monitor the status of this
line and did not know it had gone out of service. This led to a
data mismatch that prevented MISO’s state estimator (a key
monitoring tool) from producing usable results later in the day
at a time when system conditions in FE’s control area were
deteriorating (see details below, pages 46 and 48-49).

Recommendation 30, page 163


♦ Eastlake Unit 5 is a 597 MW (net) generating 
unit located west of Cleveland on Lake Erie. It is 
a major source of reactive power support for the 
Cleveland area. It tripped at 13:31 EDT. The 
cause of the trip was that as the Eastlake 5 oper¬ 
ator sought to increase the unit’s reactive power 
output (Figure 4.3), the unit’s protection system 
detected that VAr output exceeded the unit’s 
VAr capability and tripped the unit off-line. The 
loss of the Eastlake 5 unit did not put the grid 
into an unreliable state—i.e., it was still able to 
withstand safely another contingency. How¬ 
ever, the loss of the unit required FE to import 
additional power to make up for the loss of the 
unit’s output (612 MW), made voltage manage¬ 
ment in northern Ohio more challenging, and 
gave FE operators less flexibility in operating 
their system (see details on pages 45-46 and 
49-50). 


Figure 4.3. MW and MVAr Output from Eastlake Unit 5 on August 14
(axes: MW/MVAr and kV)

Key Parameters for the
Cleveland-Akron Area
at 15:05 EDT

The investigation team benchmarked their power flow models
against measured data provided by FirstEnergy for the
Cleveland-Akron area at 15:05 EDT (just before the first of
FirstEnergy’s key transmission lines failed), as shown in
Table 4.2.
Although the modeled figures do not match actual 
system conditions perfectly, overall this model 
shows a very high correspondence to the actual 
occurrences and thus its results merit a high 
degree of confidence. Although Table 4.2 shows 
only a few key lines within the Cleveland-Akron 
area, the model was successfully benchmarked to 
match actual flows, line-by-line, very closely 
across the entire area for the afternoon of August 
14, 2003. 
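The "Benchmark Accuracy" column of Table 4.2 can be reproduced
from the model and actual MVA values. The report does not state
the formula, so the sketch below assumes the error is taken
relative to the actual value; that assumption happens to match
the printed percentages. The names used are illustrative.

```python
# (from bus, to bus, model base case MVA, actual 8/14 MVA), as in Table 4.2.
TABLE_4_2 = [
    ("Chamberlin", "Harding", 482, 500),
    ("Hanna", "Juniper", 1009, 1007),
    ("S. Canton", "Star", 808, 810),
    ("Tidd", "Canton Central", 633, 638),
    ("Sammis", "Star", 728, 748),
]


def benchmark_error_pct(model_mva: float, actual_mva: float) -> float:
    """Absolute model error as a percentage of the measured value.

    Assumption: the report's accuracy figure is |model - actual| / actual.
    """
    return abs(model_mva - actual_mva) / actual_mva * 100.0


for from_bus, to_bus, model, actual in TABLE_4_2:
    print(f"{from_bus}-{to_bus}: {benchmark_error_pct(model, actual):.1f}%")
```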

The power flow model assumes the following sys¬ 
tem conditions for the Cleveland-Akron area at 
15:05 EDT on August 14: 

♦ Cleveland-Akron area load = 6,715 MW and 
2,402 MVAr 

♦ Transmission losses = 189 MW and 2,514 
MVAr 

♦ Reactive power from fixed shunt capacitors (all 
voltage levels) = 2,585 MVAr 

♦ Reactive power from line charging (all voltage 
levels) = 739 MVAr 

♦ Network configuration = after the loss of 
Eastlake 5, before the loss of Harding- 
Chamberlin 345-kV line 

♦ Area generation combined output: 3,000 MW 
and 1,200 MVAr. 

Given these conditions, the power flow model indicates that
about 3,900 MW and 400 MVAr of real power and reactive power
flow into the Cleveland-Akron area was needed to meet the sum of
customer load demanded plus line losses. There was about 688
MVAr of reactive reserve from generation in the area, which is
slightly more than the 660 MVAr reactive capability of the Perry
nuclear unit. Combined with the fact that a 5% reduction in
operating voltage would cause a 10% reduction in reactive power
(330 MVAr) from shunt capacitors and line charging and a 10%
increase (250 MVAr) in reactive losses from transmission lines,
these parameters indicate that the Cleveland-Akron area would be
precariously short of reactive power if the Perry plant were
lost.
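The import requirement and the voltage sensitivity quoted above
can be checked with a simple area balance. This is a
back-of-the-envelope sketch using only the figures listed in the
text; it is not the investigation team's power flow model, and
the variable and function names are hypothetical.

```python
# Model assumptions for the Cleveland-Akron area at 15:05 EDT, as listed above.
LOAD_MW, LOAD_MVAR = 6715, 2402
LOSS_MW, LOSS_MVAR = 189, 2514
SHUNT_CAP_MVAR, LINE_CHARGING_MVAR = 2585, 739
GEN_MW, GEN_MVAR = 3000, 1200
REACTIVE_RESERVE_MVAR, PERRY_MVAR = 688, 660


def required_imports():
    """Real and reactive power that must flow into the area (MW, MVAr)."""
    mw = LOAD_MW + LOSS_MW - GEN_MW
    mvar = (LOAD_MVAR + LOSS_MVAR
            - SHUNT_CAP_MVAR - LINE_CHARGING_MVAR - GEN_MVAR)
    return mw, mvar


def extra_need_after_voltage_drop(support_drop=0.10, loss_rise=0.10):
    """Added reactive need for a 5% voltage drop, using the text's
    approximations: 10% less capacitor and line-charging output and
    10% more reactive line loss."""
    lost_support = support_drop * (SHUNT_CAP_MVAR + LINE_CHARGING_MVAR)
    added_losses = loss_rise * LOSS_MVAR
    return lost_support, added_losses


print(required_imports())                  # (3904, 392)
print(extra_need_after_voltage_drop())     # roughly 330 and 250 MVAr
print(REACTIVE_RESERVE_MVAR - PERRY_MVAR)  # 28 MVAr of reserve without Perry
```

Under these figures, losing Perry would leave about 28 MVAr of
in-area reserve against roughly 580 MVAr of added need after a
5% voltage drop, which is the shortfall the paragraph describes.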

Power Flow Patterns 

Several commentators have suggested that the 
voltage problems in northeast Ohio and the subse¬ 
quent blackout occurred due to unprecedented 
high levels of inter-regional power transfers occur¬ 
ring on August 14. Investigation team analysis 
indicates that in fact, power transfer levels were 
high but were within established limits and previ¬ 
ously experienced levels. Analysis of actual and 
test case power flows demonstrates that inter¬ 
regional power transfers had a minimal effect on 
the transmission corridor containing the Har- 
ding-Chamberlin, Hanna-Juniper, and Star-South 
Canton 345-kV lines on August 14. It was the 
increasing native load relative to the limited 
amount of reactive power available in the Cleve¬ 
land-Akron area that caused the depletion of reac¬ 
tive power reserves and declining voltages. 

On August 14, the flow of power through the 
ECAR region as a whole (lower Michigan, Indiana, 
Ohio, Kentucky, West Virginia, and western Penn¬ 
sylvania) was heavy as a result of transfers of 
power from the south (Tennessee, etc.) and west 
(Wisconsin, Minnesota, Illinois, Missouri, etc.) to 
the north (Ohio, Michigan, and Ontario) and east 
(New York, Pennsylvania). The destinations for 
much of the power were northern Ohio, Michigan, 
PJM, and Ontario. This is shown in Figure 4.4, 
which shows the flows between control areas on 
August 14 based on power flow simulations just 
before the Harding-Chamberlin line tripped at 
15:05 EDT. FE’s total load peaked at 12,165 MW at 
16:00 EDT. Actual system data indicate that 
between 15:00 and 16:00 EDT, actual line flows 
into FE’s control area were 2,695 MW for both 
transactions and native load. 


Cause 1: Inadequate System Understanding


Table 4.2. Benchmarking Model Results to Actual

FE Circuit                  MVA Comparison                    Benchmark
From        To              Model Base Case  Actual 8/14      Accuracy
                            MVA              MVA
Chamberlin  Harding           482              500            3.6%
Hanna       Juniper         1,009            1,007            0.2%
S. Canton   Star              808              810            0.2%
Tidd        Canton Central    633              638            0.8%
Sammis      Star              728              748            2.7%

Figure 4.4. Generation, Demand, and Interregional Power Flows on August 14, 2003, at 15:05 EDT



Figure 4.5 shows total scheduled imports for the 
entire northeast region for June through August 
14, 2003. These transfers were well within the 
range of previous levels, as shown in Figure 4.5, 
and well within all established limits. In particu¬ 
lar, on August 14 increasing amounts of the grow¬ 
ing imports into the area were being delivered to 
FirstEnergy’s Ohio territory to meet its increasing 
demand and to replace the generation lost with the 
trip of Eastlake 5. The level of imports into Ontario 
from the U.S. on August 14 was high (e.g., 1,334 
MW at 16:00 EDT through the New York and 
Michigan ties) but not unusual, and well within 
IMO’s import capability. Ontario is a frequent 
importer and exporter of power, and had imported 
similar and higher amounts of power several times 
during the summers of 2002 and 2003. PJM and 
Michigan also routinely import and export power 
across ECAR. 

Some have suggested that the level of power flows into and
across the Midwest was a direct cause of the blackout on August
14. Investigation team modeling proves that these flows were
neither a cause nor a contributing factor to the blackout. The
team used detailed modeling and simulation incorporating the
NERC TagNet data on actual transactions to determine whether and
how the transactions affected line loadings within the
Cleveland-Akron area. The MUST (Managing Utilization of System
Transmission) analytical tool uses the transactions data from
TagNet along with a power flow program to determine the impact
of transactions on the loading of transmission flowgates or
specific facilities, calculating transfer distribution factors
across the various flowgates. The MUST analysis shows that for
actual flows at 15:05 EDT, only 10% of the loading on
Cleveland-Akron lines was for through flows for which FE was
neither the importer nor exporter.

Figure 4.5. Scheduled Imports and Exports for the Northeast
Central Region, June 1 through August 13, 2003 (horizontal axis:
Hour (EDT))

Note: These flows from within the Northeast Central Area
include ECAR, PJM, IMO, NYISO, and exclude transfers from
Quebec, the Maritimes and New England, since the latter areas
had minimal flows across the region of interest.

According to real-time TagNet records, at 15:05 
EDT the incremental flows due to transactions 
were approximately 2,800 MW flowing into the 
FirstEnergy control area and approximately 800 
MW out of FE to Duquesne Light Company 
(DLCO). Among the flows into or out of the FE 
control area, the bulk of the flows were for transac¬ 
tions where FE was the recipient or the source—at 
15:05 EDT the incremental flows due to transac¬ 
tions into FE were 1,300 MW from interconnec¬ 
tions with PJM, AEP, DPL and MECS, and 
approximately 800 MW from interconnections 
with DLCO. But not all of that energy moved 
through the Cleveland-Akron area and across the 
lines which failed on August 14, as Figure 4.6 
shows. 

Figure 4.6 shows how all of the transactions flowing across the
Cleveland-Akron area on the afternoon of August 14 affected line
loadings at key FE facilities, organized by time and types of
transactions. It shows that before the first transmission line
failed, the bulk of the loading on the four critical FirstEnergy
circuits (Harding-Chamberlin, Hanna-Juniper, Star-South Canton
and Sammis-Star) was to serve Cleveland-Akron area native load.
Flows to serve native load included transfers from FE’s 1,640 MW
Beaver Valley nuclear power plant and its Seneca plant, both in
Pennsylvania, which have been traditionally counted by
FirstEnergy not as imports but rather as in-area generation, and
as such excluded from TLR curtailments. An additional small
increment of line loading served transactions for which FE was
either the importer or exporter, and the remaining line loading
was due to through-flows initiated and received by other
entities. The Star-South Canton line experienced the greatest
impact from through-flows: 148 MW, or 18% of the total line
loading at 15:05 EDT, was due to through-flows resulting from
non-FE transactions. By 15:41 EDT, right before Star-South
Canton tripped (without being overloaded), the Sammis-Star line
was serving almost entirely native load, with loading from
through-flows down to only 4.5%.

Figure 4.6. Impacts of Transactions Flows on Critical Line
Loadings, August 14, 2003


Cause 1: Inadequate System Understanding

The central point of this analysis is that because the critical
lines were loaded primarily to serve native load and FE-related
flows, attempts to reduce flows through transaction curtailments
in and around the Cleveland-Akron area would have had minimal
impact on line loadings and the declining voltage situation
within that area. Rising load in the Cleveland-Akron area that
afternoon was depleting the remaining reactive power reserves.
Since there was no additional in-area generation, only in-area
load cuts could have reduced local line loadings and improved
voltage security. This is confirmed by the loadings on the
Sammis-Star line at 15:42 EDT, after the loss of Star-South
Canton: fully 96% of the current on that line was to serve FE
load and FE-related transactions, and a cut of every non-FE
through transaction flowing across northeast Ohio would have
obtained only 59 MW (4%) of relief for this specific line. This
means that redispatch of generation beyond northeast Ohio would
have had almost no impact upon conditions within the
Cleveland-Akron area (which after 13:31 EDT had no remaining
generation reserves). Equally important, cutting flows on the
Star-South Canton line might not have changed subsequent events:
because the line opened three times that afternoon due to tree
contacts, reducing its loading would not have assured its
continued operation.

Recommendations 3, page 143; 23, page 160


Power flow patterns on August 14 did not cause the blackout in
the Cleveland-Akron area. But once the first four FirstEnergy
lines went down, the magnitude and pattern of flows on the
overall system did affect the ultimate path, location and speed
of the cascade after 16:05:57 EDT. 3

Voltages and Voltage Criteria

During the days before August 14 and throughout 
the morning and mid-day on August 14, voltages 
were depressed across parts of northern Ohio 
because of high air conditioning demand and 
other loads, and power transfers into and to a 
lesser extent across the region. Voltage varies by 
location across an electrical region, and operators 
monitor voltages continuously at key locations 
across their systems. 

Entities manage voltage using long-term planning 
and day-ahead planning for adequate reactive 
supply, and real-time adjustments to operating 
equipment. On August 14, for example, PJM 
implemented routine voltage management proce¬ 
dures developed for heavy load conditions. 
Within Ohio, FE began preparations early in the 
afternoon of August 14, requesting capacitors to 
be restored to service 4 and additional voltage sup¬ 
port from generators. 5 As the day progressed, 
operators across the region took additional 
actions, such as increasing plants’ reactive power 
output, plant redispatch, and transformer tap 
changes to respond to changing voltage 
conditions. 

Voltages at key FirstEnergy buses (points at which lines,
generators, transformers, etc., converge) were declining over
the afternoon of August 14. Actual measured voltage levels at
the Star bus and others on FE’s transmission system on August 14
were below 100% starting early in the day. At 11:00 EDT, voltage
at the Star bus equaled 98.5%, declined to 97.3% after the loss
of Eastlake 5 at 13:31 EDT, and dropped to 95.9% at 15:05 EDT
after the loss of the Harding-Chamberlin line. FirstEnergy
system operators reported this voltage performance to be typical
for a warm summer day on the FirstEnergy system. The gradual
decline of voltage over the early afternoon was consistent with
the increase of load over the same time period, particularly
given that FirstEnergy had no additional generation within the
Cleveland-Akron area load pocket to provide additional reactive
support.

Cause 1: Inadequate System Understanding

NERC and regional reliability councils’ planning criteria and
operating policies (such as NERC I.A and I.D, NPCC A-2, and ECAR
Document 1) specify voltage criteria in such generic terms as:
acceptable voltages under normal and emergency conditions shall
be maintained within normal limits and applicable emergency
limits respectively, with due recognition to avoiding voltage
instability and widespread system collapse in the event of
certain contingencies. Each system then defines its own
acceptable voltage criteria based on its own system design and
equipment characteristics, detailing quantified measures
including acceptable minimum and maximum voltages in percentages
of nominal voltage and acceptable voltage declines from the
pre-contingency voltage. Good utility practice requires that
these determinations be based on a full set of V-Q (voltage
performance V relative to reactive power supply Q) and P-V (real
power transfer P relative to voltage V) analyses for a wide
range of system conditions. Table 4.3 compares the voltage
criteria used by FirstEnergy and other relevant transmission
operators in the region. As this table shows, FE uses minimum
acceptable normal voltages which are lower than and incompatible
with those used by its interconnected neighbors.

Recommendation 23, page 160


Do ATC and TTC Matter for Reliability?

Each transmission provider calculates Available Transfer
Capability (ATC) and Total Transfer Capability (TTC) as part of
its Open Access Transmission Tariff, and posts those on the
OASIS to enable others to plan power purchase transactions. TTC
is the forecast amount of electric power that can be transferred
over the interconnected transmission network in a reliable
manner under specific system conditions. ATCs are forecasts of
the amount of transmission available for additional commercial
trade above projected committed uses. These are not real-time
operating security limits for the grid.

The monthly TTC and ATC values for August 2003 were first
determined a year previously; those for August 14, 2003 were
calculated 30 days in advance; and the hourly TTC and ATC values
for the afternoon of August 14 were calculated approximately
seven days ahead using forecasted system conditions. Each of
these values should be updated as the forecast of system
conditions changes. Thus the TTC and ATC are advance estimates
for commercial purposes and do not directly reflect actual
system conditions. NERC’s operating procedures are designed to
manage actual system conditions, not forecasts such as ATC and
TTC.

Within ECAR, ATCs and TTCs are determined on a first contingency
basis, assuming that only the most critical system element may
be forced out of service during the relevant time period. If
actual grid conditions (loads, generation dispatch, transaction
requests, and equipment availability) differ from the conditions
assumed previously for the ATC and TTC calculation, then the ATC
and TTC have little relevance for actual system operations.
Regardless of what pre-calculated ATC and TTC levels may be,
system operators must use real-time monitoring and contingency
analysis to track and respond to real-time facility loadings to
assure that the transmission system is operated reliably.


Competition and Increased Electric Flows

Besides blaming high inter-regional power flows for causing the
blackout, some blame the existence of those power flows upon
wholesale electric competition. Before 1978, most power plants
were owned by vertically-integrated utilities; purchases between
utilities occurred when a neighbor had excess power at a price
lower than other options. A notable increase in inter-regional
power transfers occurred in the mid-1970s after the oil embargo,
when eastern utilities with a predominance of high-cost
oil-fired generation purchased coal-fired energy from Midwestern
generators. The 1970s and 1980s also saw the development of
strong north-to-south trade between British Columbia and
California in the west, and Ontario, Quebec, and New York-New
England in the east. Americans benefited from Canada’s
competitively priced hydroelectricity and nuclear power while
both sides gained from seasonal and daily banking and load
balancing: Canadian provinces had winter peaking loads while
most U.S. utilities had primarily summer peaks.

In the United States, wholesale power sales by independent power
producers (IPPs) began after passage of the Public Utility
Regulatory Policy Act of 1978, which established the right of
non-utility producers to operate and sell their energy to
utilities. This led to extensive IPP development in the
northeast and west, increasing in-region and inter-regional
power sales as utility loads grew without corresponding utility
investments in transmission. In 1989, investor-owned utilities
purchased 17.8% of their total energy (self-generation plus
purchases) from other utilities and IPPs, compared to 37.3% in
2002; and in 1992, large public power entities purchased 36.3%
of total energy (self-generation plus purchases), compared to
40.5% in 2002. a

In the Energy Policy Act of 1992, Congress continued to promote
the development of competitive energy markets by introducing
exempt wholesale generators that would compete with utility
generation in wholesale electric markets (see Section 32 of the
Public Utility Holding Company Act). Congress also broadened the
authority of the Federal Energy Regulatory Commission to order
transmission access on a case-by-case basis under Section 211 of
the Federal Power Act. Consistent with this Congressional
action, the Commission in Order 888 ordered all public utilities
that own, operate, or control interstate transmission facilities
to provide open access for sales of energy transmitted over
those lines.

Competition is not the only thing that has grown over the past
few decades. Between 1986 and 2002, peak demand across the
United States grew by 26%, and U.S. electric generating capacity
grew by 22%, b but U.S. transmission capacity grew little beyond
the interconnection of new power plants. Specifically, “the
amount of transmission capacity per unit of consumer demand
declined during the past two decades and ... is expected to drop
further in the next decade.” c

Load-serving entities today purchase power for the same reason
they did before the advent of competition (to serve their
customers with low-cost energy), and the U.S. Department of
Energy estimates that Americans save almost $13 billion (U.S.)
annually on the cost of electricity from the opportunity to buy
from distant, economical sources. But it is likely that the
increased loads and flows across a transmission grid that has
experienced little new investment are causing greater “stress
upon the hardware, software and human beings that are the
critical components of the system.” d A thorough study of these
issues has not been possible as part of the Task Force’s
investigation, but such a study would be worthwhile. For more
discussion, see Recommendation 12, page 148.

a RDI PowerDat database.
b U.S. Energy Information Administration, Energy Annual Data Book, 2003 edition.
c Dr. Eric Hirst, “Expanding U.S. Transmission Capacity,” August 2000, p. vii.
d Letter from Michael H. Dworkin, Chairman, State of Vermont Public Service Board, February 11, 2004, to Alison Silverstein
and Jimmy Glotfelty.


The investigation team probed deeply into voltage management
issues within the Cleveland-Akron area. As noted previously, a
power system with higher operating voltage and larger reactive
power reserves is more resilient or robust in the face of load
increases and operational contingencies. Higher transmission
voltages enable higher power transfer capabilities and reduce
transmission line losses (both real and reactive). For the
Cleveland-Akron area, FE has been operating the system with the
minimum voltage level at 90% of nominal rating, with alarms set
at 92%. 6 The criteria allow for a single contingency to occur
if voltage remains above 90%. The team conducted extensive
voltage stability studies (discussed below), concluding that
FE’s 90% minimum voltage level was not only far less stringent
than nearby interconnected systems (most of which set the
pre-contingency minimum voltage criteria at 95%), but was not
adequate for secure system operations.

Examination of the Form 715 filings made by Ohio Edison, FE’s
predecessor company, for 1994 through 1997 indicates that Ohio
Edison used pre-contingency bus voltage criteria of 95 to 105%
and 90% emergency post-contingency voltage, with acceptable
change in voltage no greater than 5%. These historic criteria
were compatible with neighboring transmission operator
practices.

A look at voltage levels across the region illus¬ 
trates the difference between FE’s voltage 
situation on August 14 and that of its neighbors. 


Cause 1: Inadequate System Understanding

Figure 4.7 shows the profile of voltage levels at key buses from
southeast Michigan across Ohio into western Pennsylvania from
August 11 through 14 and for several hours on August 14. These
transects show that across the area, voltage levels were
consistently lower at the 345-kV buses in the Cleveland-Akron
area (from Beaver to Hanna on the west to east plot and from
Avon Lake to Star on the north to south plot) for the three days
and the 13:00 to 15:00 EDT period preceding the blackout.
Voltage was consistently and considerably higher at the outer
ends of each transect, where it never dropped below 96% even on
August 14. These profiles also show clearly the decline of
voltage over the afternoon of August 14, with voltage at the
Harding bus at 15:00 EDT just below 96% before the
Harding-Chamberlin line tripped at 15:05 EDT, and dropping down
to around 93% at 16:00 EDT after the loss of lines and load in
the immediate area.


Cause 1: Inadequate System Understanding


Using actual data provided by FE, ITC, AEP and PJM, Figure 4.8
shows the availability of reactive reserves (the difference
between reactive power generated and the maximum reactive
capability) within the Cleveland-Akron area and four regions
surrounding it, from ITC to PJM. On the afternoon of August 14,
the graph shows that reactive power generation was heavily taxed
in the Cleveland-Akron area but that extensive MVAr reserves
were available in the neighboring areas. As the afternoon
progressed, reactive reserves diminished for all five regions as
load grew. But reactive reserves were fully depleted within the
Cleveland-Akron area by 16:00 EDT without drawing down the
reserves in neighboring areas, which remained at scheduled
voltages. The region as a whole had sufficient reactive
reserves, but because reactive power cannot be transported far
but must be supplied from

Table 4.3. Comparison of Voltage Criteria (Percent)

345 kV/138 kV            FE    PJM   AEP    METC a  ITC b  MISO  IMO c
High                     105   105   105    105     105    105   110
Normal Low               90    95    95     97      95     95    98
Emergency/Post N-1 Low   90    92    90 d           87           94
Maximum N-1 deviation    5 e                5                    10

a Applies to 138 kV only. 345 kV not specified.
b Applies to 345 kV only. Min-max normal voltage for 120 kV and 230 kV is 93-105%.
c 500 kV.
d 92% for 138 kV.
e 10% for 138 kV.

Figure 4.7. Actual Voltages Across the Ohio Area Before and On August 14, 2003
Panels (horizontal axis: Bus):
West-East 345kV Actual (Measured) Voltages Leading up to August 14th (4:00 PM EDT)
North-South 345kV Actual (Measured) Voltages Leading up to August 14th (4:00 PM EDT)
Hourly West-East 345kV Actual (Measured) Voltages on August 14th
Hourly North-South 345kV Actual (Measured) Voltages on August 14th

Voltage Stability Analysis

Voltage instability or voltage collapse occurs on a 
power system when voltages progressively 
decline until stable operating voltages can no 
longer be maintained. This is precipitated by an 
imbalance of reactive power supply and demand, 
resulting from one or more changes in system 
conditions including increased real or reactive 
loads, high power transfers, or the loss of genera¬ 
tion or transmission facilities. Unlike the phe¬ 
nomenon of transient instability, where 
generators swing out of synchronism with the 
rest of the power system within a few seconds or 
less after a critical fault, voltage instability can 
occur gradually within tens of seconds or 
minutes. 

Voltage instability is best studied using V-Q (voltage relative
to reactive power) and P-V (real power relative to voltage)
analysis. V-Q analysis evaluates the reactive power required at
a bus to maintain stable voltage at that bus. A simulated
reactive power source is added to the bus, the voltage schedule
at the bus is adjusted in small steps from an initial operating
point, and power flows are solved to determine the change in
reactive power demand resulting from the change in voltage.
Under stable operating conditions, when voltage increases the
reactive power requirement also increases, and when voltage
falls the reactive requirement also falls. But when voltage is
lowered at the bus and the reactive requirement at that bus
begins to increase (rather than continuing to decrease), the
system becomes unstable. The voltage point corresponding to the
transition from stable to unstable conditions is known as the
“critical voltage,” and the reactive power level at that point
is the “reactive margin.” The desired operating voltage level
should be well above the critical voltage with a large buffer
for changes in prevailing system conditions and contingencies.
Similarly, reactive margins should be large to assure robust
voltage levels and secure, stable system performance.

The illustration below shows a series of V-Q curves. The lowest
curve, A, reflects baseline conditions for the grid with all
facilities available. Each higher curve represents the same loads
and transfers for the region modeled, but with another contingency
event (a circuit loss) occurring to make the system less stable.
With each additional contingency, the critical voltage rises (the
point on the horizontal axis corresponding to the lowest point on
the curve) and the reactive margin decreases (the difference between
the reactive power at the critical voltage and the zero point on the
vertical axis). This means the system is closer to instability.


V-Q (Voltage-Reactive Power) Curves

Voltage Stability Analysis (Continued)

V-Q analyses and experience with heavily loaded power systems
confirm that critical voltage levels can rise above the 95% level
traditionally considered as normal. Thus voltage magnitude alone is
a poor indicator of voltage stability and V-Q analysis must be
carried out for several critical buses in a local area, covering a
range of load and generation conditions and known contingencies that
affect voltages at these buses.

P-V analysis (real power relative to voltage) is a companion tool
which determines the real power transfer capability across a
transmission interface for load supply or a power transfer. Starting
from a base case system state, a series of load flows with
increasing power transfers are solved while monitoring voltages at
critical buses. When power transfers reach a high enough level, a
stable voltage cannot be sustained and the power flow model fails to
solve. The point where the power flow last solved corresponds to the
critical voltage level found in the V-Q curve for those conditions.
On a P-V curve (see below), this point is called the “nose” of the
curve.
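
The stepping procedure described above can be sketched on the same
kind of toy network: raise the transfer in small increments, solve
for the bus voltage at each step, and stop when no solution exists.
Everything in this sketch is an assumption made for illustration (a
single lossless line, per-unit values, the function names, and the
use of a larger reactance to represent a weakened network); it is
not the model used by the investigation team.

```python
import math

def bus_voltage(p_load, x_line, tan_phi=0.0, e_source=1.0):
    """Upper (stable) voltage solution at the load bus, or None.

    Assumed toy model: a lossless line of reactance x_line feeds a load
    p_load + j*p_load*tan_phi. None means the power flow has no solution.
    """
    q_load = p_load * tan_phi
    # V^4 + (2*Q*X - E^2)*V^2 + X^2*(P^2 + Q^2) = 0, solved for V^2.
    b = e_source ** 2 - 2.0 * q_load * x_line
    disc = b * b - 4.0 * x_line ** 2 * (p_load ** 2 + q_load ** 2)
    if disc < 0 or b + math.sqrt(disc) <= 0:
        return None
    return math.sqrt((b + math.sqrt(disc)) / 2.0)

def pv_curve(x_line, step=0.01, **kwargs):
    """Raise the transfer in equal steps until the power flow stops solving."""
    curve = []
    n = 0
    while True:
        p = n * step
        v = bus_voltage(p, x_line, **kwargs)
        if v is None:
            return curve  # the last entry approximates the nose of the curve
        curve.append((p, v))
        n += 1

# Hypothetical cases: the second uses a higher reactance (weaker network).
curve_a = pv_curve(x_line=0.20)
curve_b = pv_curve(x_line=0.25)
p_nose_a, v_nose_a = curve_a[-1]
p_nose_b, v_nose_b = curve_b[-1]
print("case A nose: P=%.2f pu, V=%.3f pu" % (p_nose_a, v_nose_a))
print("case B nose: P=%.2f pu, V=%.3f pu" % (p_nose_b, v_nose_b))
```

With the weaker network the last solvable transfer is smaller and
the voltage at any given transfer is lower, so the usable operating
region shrinks.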

This set of P-V curves illustrates that for baseline conditions
shown in curve A, voltage remains relatively steady (change along
the vertical axis) as load increases within the region (moving out
along the horizontal axis). System conditions are secure and stable
in the area above the “nose” of the curve. After a contingency
occurs, such as a transmission circuit or generator trip, the new
condition set is represented by curve B, with lower voltages
(relative to curve A) for any load on curve B. As the operator’s
charge is to keep the system stable against the next worst
contingency, the system must be operated to stay well inside the
load level for the nose of curve B. If the B contingency occurs,
there is a next worst contingency curve inside curve B, and the
operator must adjust the system to pull back operations to within
the safe, buffered space represented by curve C.

The investigation team conducted extensive V-Q and P-V analyses for
the area around Cleveland-Akron for the conditions in effect on
August 14, 2003. Team members examined over fifty 345-kV and 138-kV
buses across the systems of FirstEnergy, AEP, International
Transmission Company, Duquesne Light Company, Allegheny Power
Systems and Dayton Power & Light. The V-Q analysis alone involved
over 10,000 power flow simulations using a system model with more
than 43,000 buses and 57,000 lines and transformers. The P-V
analyses used the same model and data sets. Both examined conditions
and combinations of contingencies for critical times before and
after key events on the FirstEnergy system on the day of the
blackout.
local sources, these healthy reserves nearby could
not support the Cleveland-Akron area’s reactive
power deficiency and growing voltage problems.
Even FE’s own generation in the Ohio Valley had
reactive reserves that could not support the sagging
voltages inside the Cleveland-Akron area.


Cause 1: Inadequate System Understanding

An important consideration in reactive power planning is to ensure
an appropriate balance between static and dynamic reactive power
resources across the interconnected system (as specified in NERC
Planning Standard 1D.S1). With so little generation left in the
Cleveland-Akron area on August 14, the area’s dynamic reactive
reserves were depleted and the area relied heavily on static
compensation to respond to changing system conditions and support
voltages. But a system relying on static compensation can experience
a gradual voltage degradation followed by a sudden drop in voltage
stability—the P-V curve for such a system has a very steep slope
close to the nose, where voltage collapses. On August 14, the lack
of adequate dynamic reactive reserves, coupled with not knowing the
critical voltages and maximum import capability to serve native
load, left the Cleveland-Akron area in a very vulnerable state.

Recommendation 23, page 160


Past System Events 
and Adequacy of System Studies 

In June 1994, with three generators in the Cleveland area out on
maintenance, inadequate reactive reserves and falling voltages in
the Cleveland area forced Cleveland Electric Illuminating (CEI, a
predecessor company to FirstEnergy) to shed load within Cleveland (a
municipal utility and wholesale transmission and purchase customers
within CEI’s control area) to avoid voltage collapse. 7 The
Cleveland-Akron area’s voltage problems were well-known and
reflected in the stringent voltage criteria used by control area
operators until 1998. 8

In the summer of 2002, AEP’s South Canton 765 kV to 345 kV
transformer (which connects to FirstEnergy’s Star 345-kV line)
experienced eleven days of severe overloading when actual loadings
exceeded normal rating and contingency loadings were at or above
summer emergency ratings. In each instance, AEP took all available
actions short of load shedding to return the system to a secure
state, including TLRs, switching, and dispatch adjustments. These
excessive loadings were calculated to have diminished the remaining
life of the transformer by 30%. AEP replaced this single-phase
transformer in the winter of 2002-03, marginally increasing the
capacity of the South Canton transformer bank.

Figure 4.8. Reactive Reserves Around Ohio on August 14, 2003, for Representative Generators in the Area

Note: These reactive reserve MVAr margins were calculated for the five regions for the following plants: (1) Cleveland area of
FirstEnergy—Ashtabula 5, Perry 1, Eastlake 1, Eastlake 3, Lakeshore 18; (2) Northern central portion of AEP near FirstEnergy
(South-Southeast of Akron)—Cardinal 1, Cardinal 2, Cardinal 3, Kammer 2, Kammer 3; (3) Southwest area of MECS (ITC)—Fermi
1, Monroe 2, Monroe 3, Monroe 4; (4) Ohio Valley portion of FirstEnergy—Sammis 4, Sammis 5, Sammis 6, Sammis 7; (5) Western
portion of PJM—Keystone 1, Conemaugh 1, Conemaugh 2.

Following these events, AEP conducted extensive modeling to
understand the impact of a potential outage of this transformer.
That modeling revealed that loss of the South Canton transformer,
especially if it occurred in combination with outages of other
critical facilities, would cause significant low voltages and
overloads on both the AEP and FirstEnergy systems. AEP shared these
findings with FirstEnergy in a meeting on January 10, 2003. 9

AEP subsequently completed a set of system studies,
including long range studies for 2007, which
included both single contingency and extreme


Independent Power Producers and Reactive Power 


Independent power producers (IPPs) are power plants that are not
owned by utilities. They operate according to market opportunities
and their contractual agreements with utilities, and may or may not
be under the direct control of grid operators. An IPP’s reactive
power obligations are determined by the terms of its contractual
interconnection agreement with the local transmission owner. Under
routine conditions, some IPPs provide limited reactive power because
they are not required or paid to produce it; they are only paid to
produce active power. (Generation of reactive power by a generator
can require scaling back generation of active power.) Some
contracts, however, compensate IPPs for following a voltage schedule
set by the system operator, which requires the IPP to vary its
output of reactive power as system conditions change. Further,
contracts typically require increased reactive power production from
IPPs when it is requested by the control area operator during times
of a system emergency. In some contracts, provisions call for the
payment of opportunity costs to IPPs when they are called on for
reactive power (i.e., they are paid the value of foregone active
power production).

Thus, the suggestion that IPPs may have contributed to the
difficulties of reliability management on August 14 because they
don’t provide reactive power is misplaced. What the IPP is required
to produce is governed by contractual arrangements, which usually
include provisions for contributions to reliability, particularly
during system emergencies. More importantly, it is the
responsibility of system planners and operators, not IPPs, to plan
for reactive power requirements and make any short-term arrangements
needed to ensure that adequate reactive power resources will be
available.


Power Flow Simulation of Pre-Cascade Conditions 

The bulk power system has no memory. It does 
not matter if frequencies or voltage were unusual 
an hour, a day, or a month earlier. What matters 
for reliability are loadings on facilities, voltages, 
and system frequency at a given moment and the 
collective capability of these system components 
at that same moment to withstand a contingency 
without exceeding thermal, voltage, or stability 
limits. 

Power system engineers use a technique called power flow simulation
to reproduce known operating conditions at a specific time by
calibrating an initial simulation to observed voltages and line
flows. The calibrated simulation can then be used to answer a series
of “what if” questions to determine whether the system was in a safe
operating state at that time. The “what if” questions consist of
systematically simulating outages by removing key elements (e.g.,
generators or transmission lines) one by one and reassessing the
system each time to determine whether line or voltage limits would
be exceeded. If a limit is exceeded, the system is not in a secure
state. As described in Chapter 2, NERC operating policies require
operators, upon finding that their system is not in a reliable
state, to take immediate actions to restore the system to a reliable
state as soon as possible and within a maximum of 30 minutes.

To analyze the evolution of the system on the afternoon of August
14, this process was followed to model several points in time,
corresponding to key transmission line trips. For each point, three
solutions were obtained: (1) conditions immediately before a
facility tripped off; (2) conditions immediately after the trip; and
(3) conditions created by any automatic actions taken following the
trip.
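
The “what if” screening described in this box can be sketched on a
very small network. The example below is a minimal illustration
under stated assumptions, not the investigation team's method: it
uses a linearized (DC) power flow rather than a full AC solution, a
hypothetical three-bus triangle with invented reactances, loads, and
line ratings, and function names chosen for this sketch. It checks
thermal loading only; voltage limits are not modeled.

```python
def dc_flows(lines, injections):
    """DC power flow on a three-bus network with bus 0 as the slack.

    lines: {name: (from_bus, to_bus, reactance, limit_mw)}
    injections: {1: mw, 2: mw}; negative values are load.
    Returns {name: flow_mw}, positive from from_bus to to_bus.
    Angles are scaled so that flows come out directly in MW.
    """
    # Build the 2x2 susceptance matrix for buses 1 and 2.
    b = {(i, j): 0.0 for i in (1, 2) for j in (1, 2)}
    for frm, to, x, _limit in lines.values():
        for bus in (frm, to):
            if bus != 0:
                b[(bus, bus)] += 1.0 / x
        if frm != 0 and to != 0:
            b[(frm, to)] -= 1.0 / x
            b[(to, frm)] -= 1.0 / x
    # Solve B * theta = P with Cramer's rule (slack angle is zero).
    det = b[(1, 1)] * b[(2, 2)] - b[(1, 2)] * b[(2, 1)]
    p1, p2 = injections[1], injections[2]
    theta = {
        0: 0.0,
        1: (p1 * b[(2, 2)] - b[(1, 2)] * p2) / det,
        2: (b[(1, 1)] * p2 - b[(2, 1)] * p1) / det,
    }
    return {name: (theta[frm] - theta[to]) / x
            for name, (frm, to, x, _limit) in lines.items()}


def n_minus_1_screen(lines, injections):
    """Remove each line in turn; report which remaining lines overload."""
    violations = {}
    for outaged in lines:
        remaining = {n: v for n, v in lines.items() if n != outaged}
        flows = dc_flows(remaining, injections)
        overloaded = sorted(n for n, f in flows.items()
                            if abs(f) > remaining[n][3])
        if overloaded:
            violations[outaged] = overloaded
    return violations


# Hypothetical data: bus 0 generates, buses 1 and 2 carry load.
lines = {
    "A-B": (0, 1, 0.1, 150.0),
    "A-C": (0, 2, 0.1, 150.0),
    "B-C": (1, 2, 0.1, 150.0),
}
injections = {1: -100.0, 2: -120.0}

base_flows = dc_flows(lines, injections)
result = n_minus_1_screen(lines, injections)
print("base case flows (MW):", base_flows)
print("contingencies causing overloads:", result)
```

In this invented case every line is within its rating with all
facilities in service, yet the loss of either line from the
generating bus overloads the other, so the system would not be in a
secure state.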
disturbance possibilities. These studies showed that with heavy
transfers to the north, expected overloading of the South Canton
transformer and depressed voltages would occur following the loss of
the Perry unit and the loss of the Tidd-Canton Central 345-kV line,
and probable cascading into voltage collapse across northeast Ohio
would occur for nine different double contingency combinations of
generation and transmission or transmission and transmission
outages. 10 AEP shared these findings with FirstEnergy in a meeting
on May 21, 2003. Meeting notes indicate that “neither AEP or FE were
able to identify any changes in transmission configuration or
operating procedures which could be used during 2003 summer to be
able to control power flows through the S. Canton bank.” 11 Meeting
notes include an action item that both “AEP and FE would share the
results of these studies and expected performance for 2003 summer
with their Management and Operations personnel.” 12

Reliability coordinators and control areas prepare regional and
seasonal studies for a variety of system-stressing scenarios, to
better understand potential operational situations, vulnerabilities,
risks, and solutions. However, the studies FirstEnergy relied
on—both by FirstEnergy and ECAR—were not robust, thorough, or
up-to-date. This left FE’s planners and operators with a deficient
understanding of their system’s capabilities and risks under a range
of system conditions. None of the past voltage events noted above or
the significant risks identified in AEP’s 2002-2003 studies are
reflected in any FirstEnergy or ECAR seasonal or longer-term
planning studies or operating protocols available to the
investigation team.

FE’s 2003 Summer Study focused primarily on single-contingency (N-1)
events, and did not consider significant multiple contingency losses
and security. FirstEnergy examined only thermal limits and looked at
voltage only to assure that voltage levels remained within a range
of 90 to 105% of nominal voltage on the 345 kV and 138 kV network.
The study assumed that only the Davis-Besse power plant (883 MW)
would be out of service at peak load of 13,206 MW; on August 14,
peak load reached 12,166 MW and scheduled generation outages
included Davis-Besse, Sammis 3 (180 MW) and Eastlake 4 (240 MW),
with Eastlake 5 (597 MW) lost in real time. The study assumed that
all transmission facilities would be in service; on August 14,
scheduled transmission outages included the Eastlake #62 345/138 kV
transformer and the Fox #1 138-kV capacitor, with other capacitors
down in real time. Last, the study assumed a single set of import
and export conditions, rather than testing a wider range of
generation dispatch, import-export, and inter-regional transfer
conditions. Overall, the summer study posited less stressful system
conditions than actually occurred August 14, 2003 (when load was
well below historic peak demand). It did not examine system
sensitivity to key parameters to determine system operating limits
within the constraints of transient stability, voltage stability,
and thermal capability.

Recommendation 23, page 160

FirstEnergy has historically relied upon the ECAR regional
assessments to identify anticipated reactive power requirements and
recommended corrective actions. But ECAR over the past five years
has not conducted any detailed analysis of the Cleveland-Akron area
and its voltage-constrained import capability—although that
constraint had been an operational consideration in the 1990s and
was documented in testimony filed in 1996 with the Federal Energy
Regulatory Commission. 13 The voltage-constrained import capability
was not studied; FirstEnergy had modified the criteria around 1998
and no longer followed the tighter voltage limits used earlier. In
the ECAR “2003 Summer Assessment of Transmission System
Performance,” dated May 2003, FirstEnergy’s Individual Company
Assessment identified potential overloads for the loss of both Star
345/138 transformers, but did not mention any expected voltage
limitation.

Recommendation 3, page 143


FE participates in ECAR studies that evaluate extreme contingencies
and combinations of events. ECAR does not conduct exacting
region-wide analyses, but compiles individual members’ internal
studies of N-2 and multiple contingencies (which may include loss of
more than one circuit, loss of a transmission corridor with several
transmission lines, loss of a major substation or generator, or loss
of a major load pocket). The last such study conducted was published
in 2000, projecting system conditions for 2003. That study did not
include any contingency cases that resulted in 345-kV line
overloading or voltage violations on 345-kV buses. FE reported no
evidence of a risk of cascading, but reported that some local load
would be lost and generation redispatch would be needed to alleviate
some thermal overloads.

ECAR and Organizational Independence

ECAR was established in 1967 as a regional reliability council, to
“augment the reliability of the members’ electricity supply systems
through coordination of the planning and operation of the members’
generation and transmission facilities.” a ECAR’s membership
includes 29 major electricity suppliers serving more than 36 million
people.

ECAR’s annual budget for 2003 was $5.15 million (U.S.), including
$1.775 million (U.S.) paid to fund NERC. b These costs are funded by
its members in a formula that reflects megawatts generated, megawatt
load served, and miles of high voltage lines. AEP, ECAR’s largest
member, pays about 15% of total ECAR expenses; FirstEnergy pays
approximately 8 to 10%. c

Utilities “whose generation and transmission have an impact on the
reliability of the interconnected electric systems” of the region
are full ECAR members, while small utilities, independent power
producers, and marketers can be associate members. d Its Executive
Board has 22 seats, one for each full member utility or major
supplier (including every control area operator in ECAR). Associate
members do not have voting rights, either on the Board or on the
technical committees which do all the work and policy-setting for
the ECAR region.

All of the policy and technical decisions for ECAR, including all
interpretations of NERC guidelines, policies, and standards within
ECAR, are developed by committees (called “panels”), staffed by
representatives from the ECAR member companies. Work allocation and
leadership within ECAR are provided by the Board, the Coordination
Review Committee, and the Market Interface Committee.

ECAR has a staff of 18 full-time employees, headquartered in Akron,
Ohio. The staff provides engineering analysis and support to the
various committees and working groups. Ohio Edison, a FirstEnergy
subsidiary, administers salary, benefits, and accounting services
for ECAR. ECAR employees automatically become part of Ohio Edison’s
(FirstEnergy’s) 401(k) retirement plan; they receive FE stock as a
matching share to employee 401(k) investments and can purchase FE
stock as well. Neither ECAR staff nor board members are required to
divest stock holdings in ECAR member companies. e Despite the close
link between FirstEnergy’s financial health and the interest of
ECAR’s staff and management, the investigation team has found no
evidence to suggest that ECAR staff favor FirstEnergy’s interests
relative to other members.


ECAR decisions appear to be dominated by the member control areas,
which have consistently allowed the continuation of past practices
within each control area to meet NERC requirements, rather than
insisting on more stringent, consistent requirements for such
matters as operating voltage criteria or planning studies. ECAR
member representatives also staff the reliability council’s audit
program, measuring individual control area compliance against local
standards and interpretations. It is difficult for an entity
dominated by its members to find that the members’ standards and
practices are inadequate. But it should also be recognized that
NERC’s broadly worded and ambiguous standards have enabled and
facilitated the lax interpretation of reliability requirements
within ECAR over the years.

Recommendations 2, page 143; 3, page 143


a ECAR “Executive Manager’s Remarks,” http://www.ecar.org.

b Interview with Brantley Eldridge, ECAR Executive Manager, March 10, 2004.

c Interview with Brantley Eldridge, ECAR Executive Manager, March 3, 2004.

d ECAR “Executive Manager’s Remarks,” http://www.ecar.org.

e Interview with Brantley Eldridge, ECAR Executive Manager, March 3, 2004.

Model-Based Analysis
of the State of the Regional Power 
System at 15:05 EDT, Before the 
Loss of FE’s Harding-Chamberlin 
345-kV Line 

As the first step in modeling the August 14 blackout, the
investigation team established a base case by creating a power flow
simulation for the entire Eastern Interconnection and benchmarking
it to recorded system conditions at 15:05 EDT on August 14. The team
started with a projected summer 2003 power flow case for the Eastern
Interconnection developed in the spring of 2003 by the Regional
Reliability Councils to establish guidelines for safe operations for
the coming summer. The level of detail involved in this region-wide
power flow case far exceeds that normally considered by individual
control areas and reliability coordinators. It consists of a
detailed representation of more than 43,000 buses, 57,600
transmission lines, and all major generating stations across the
northern U.S. and eastern Canada. The team revised the summer power
flow case to match recorded generation, demand, and power
interchange levels among control areas at 15:05 EDT on August 14.
The benchmarking consisted of matching the calculated voltages and
line flows to recorded observations at more than 1,500 locations
within the grid. Thousands of hours of effort were required to
benchmark the model satisfactorily to observed conditions at 15:05
EDT.

Once the base case was benchmarked, the team ran a contingency
analysis that considered more than 800 possible events—including the
loss of the Harding-Chamberlin 345-kV line—as points of departure
from the 15:05 EDT case. None of these contingencies resulted in a
violation of a transmission line loading or bus voltage limit prior
to the trip of FE’s Harding-Chamberlin 345-kV line. That is,
according to these simulations, the system at 15:05 EDT was capable
of safe operation following the occurrence of any of the tested
contingencies. From an electrical standpoint, therefore, before
15:05 EDT the Eastern Interconnection was being operated within all
established limits and in full compliance with NERC’s operating
policies. However, after loss of the Harding-Chamberlin 345-kV line,
the system would have exceeded emergency ratings immediately on
several lines for two of the contingencies studied—in other words,
it would no longer be operating in compliance with NERC Operating
Policy A.2 because it could not be brought back into a secure
operating condition within 30 minutes.

Perry Nuclear Plant as a 
First Contingency 

Investigation team modeling demonstrates that the Perry nuclear unit
(1,255 MW near Lake Erie) is critical to the voltage stability of
the Cleveland-Akron area in general and particularly on August 14.
The modeling reveals that had Perry tripped before 15:05 EDT,
voltage levels at key FirstEnergy buses would have fallen close to
93% with only 150 MW of area load margin (2% of the Cleveland-Akron
area load); but had Perry been lost after the Harding-Chamberlin
line went down at 15:05 EDT, the Cleveland-Akron area would have
been close to voltage collapse.

Perry and Eastlake 5 together have a combined real power capability
of 1,852 MW and reactive capability of 930 MVAr. If one of these
units is lost, it is necessary to immediately replace the lost
generation with MW and MVAr imports (although reactive power does
not travel far under heavy loading); without quick-start generation
or spinning reserves or dynamic reactive reserves inside the
Cleveland-Akron area, system security may be jeopardized. On August
14, as noted previously, there were no significant spinning reserves
remaining within the Cleveland-Akron area following the loss of
Eastlake 5 at 13:31 EDT. If Perry had been lost FE would have been
unable to meet the 30-minute security adjustment requirement of
NERC’s Operating Policy 2, without the ability to shed load quickly.
The loss of Eastlake 5 followed by the loss of Perry are
contingencies that should be assessed in the operations planning
timeframe, to develop measures to readjust the system between
contingencies. Since FirstEnergy did not conduct such contingency
analysis planning and develop these advance measures, it was in
violation of NERC Planning Standard 1A, Category C3.

This operating condition is not news. Historically, the loss of
Perry at full output has been recognized as FE’s most critical
single contingency for the Cleveland Electric Illuminating area, as
documented by FE’s 1998 Summer Import Capability study. Perry’s MW
and MVAr total output capability exceeded the import capability of
any of the critical 345-kV circuits into the Cleveland-Akron area
after the loss of Eastlake 5 at 13:31 EDT. This means that if the
Perry plant had been lost on August 14 after Eastlake 5 went down—or
on many other days with similar loads and outages—it would have been
difficult or impossible for FE operators to adjust the system within
30 minutes to prepare for the next critical contingency, as required
by NERC Operating Policy A.2. In real-time operations, operators
would have to calculate operating limits and prepare to use the last
resort of manually shedding large blocks of load before the second
contingency, or immediately after it if automatic load-shedding is
available.

Cause 1: Inadequate System Understanding


The investigation team could not find FirstEnergy contingency plans
or operational procedures for operators to manage the FirstEnergy
control area and protect the Cleveland-Akron area from the
unexpected loss of the Perry plant.

Cause 1: Inadequate System Understanding

Recommendation 23, page 160


To examine the impact of this worst contingency on the
Cleveland-Akron area on August 14, Figure 4.9 shows the V-Q curves
for key buses in the Cleveland-Akron area at 15:05 EDT, before and
after the loss of the Harding-Chamberlin line. The curves on the
left look at the impact of the loss of Perry before the
Harding-Chamberlin trip, while the curves on the right show the
impact had the nuclear plant been lost after Harding-Chamberlin went
out of service. Had Perry gone down before the Harding-Chamberlin
outage, reactive margins at key FE buses would have been minimal
(with the tightest margin at the Harding bus, read along the Y-axis)
and the critical voltage (the point before voltage collapse, read
along the X-axis) at the Avon bus would have risen to
90.5%—uncomfortably close to the limits which FE considered as an
acceptable operating range. But had the Perry unit gone off-line
after Harding-Chamberlin, reactive margins at all these buses would
have been even tighter (with only 60 MVAr at the Harding bus), and
critical voltage at Avon would have risen to 92.5%, worse than FE’s
90% minimum acceptable voltage. The system at this point would be
very close to voltage instability. If the first line outage on
August 14, 2003, had been at Hanna-Juniper rather than at
Harding-Chamberlin, the FirstEnergy system could not have withstood
the loss of the Perry plant.

The above analysis assumed load levels consistent with August 14.
But temperatures were not particularly high that day and loads were
nowhere near FE’s historic load level of 13,229 MW for the control
area (in August 2002). Therefore the investigation team looked at
what might have happened in the Cleveland-Akron area had loads
neared the historic peak—approximately 625 MW higher than the 6,715
MW peak load in the Cleveland-Akron area in 2003. Figure 4.10 uses
P-V analysis to show the impact of increased load levels on voltages
at the Star bus with and without the Perry unit before the loss of
the Harding-Chamberlin line at 15:05 EDT. The top line shows that
with the Perry plant available, local load could have increased by
625 MW and voltage at Star would have remained above 95%. But the
bottom line, simulating the loss of Perry, indicates that load could
only have increased by about 150 MW before voltage at Star would
have become unsolvable, indicating no voltage stability margin and,
depending on load dynamics, possible voltage collapse.

Figure 4.9. Loss of the Perry Unit Hurts Critical
Voltages and Reactive Reserves: V-Q Analyses 



The above analyses indicate that the Cleveland-Akron area was highly
vulnerable on the afternoon of August 14. Although the system was
compliant with NERC Operating Policy 2 A.1 for single contingency
reliability before the loss of the Harding-Chamberlin line at 15:05
EDT, had FE lost the Perry plant its system would have neared
voltage instability or could have gone into a full voltage collapse
immediately if the Cleveland-Akron area load were 150 MW higher. It
is worth noting that this could have happened on August 14—at 13:43
EDT that afternoon, the Perry plant operator called the control area
operator to warn about low voltages. At 15:36:51 EDT the Perry plant
operator called FirstEnergy’s system control center to ask about
voltage spikes at the plant’s main transformer. 14 At 15:42:49 EDT
the Perry operator called the FirstEnergy operator to say, “I’m
still getting a lot of voltage spikes and swings on the generator
.... I’m taking field volts pretty close to where I’ll trip the
turbine off.” 15

System Frequency 

Assuming stable conditions, the system frequency is the same across
an interconnected grid at any particular moment. System frequency
will vary from moment to moment, however, depending on the
second-to-second balance between aggregate generation and aggregate
demand across the interconnection. System frequency is monitored on
a continuous basis.

There were no significant or unusual frequency oscillations in the
Eastern Interconnection on August 14 prior to 16:09 EDT compared to
prior days, and frequency was well within the bounds of safe
operating practices. System frequency variation was not a cause or
precursor of the initiation of the blackout. But once the cascade
began, the large frequency swings that occurred early on became a
principal means by which the blackout spread across a wide area.

Frequency Management

Each control area is responsible for maintaining
a balance between its generation and demand. If
persistent under-frequency occurs, at least one
control area somewhere is “leaning on the grid,”
meaning that it is taking unscheduled electricity
from the grid, which both depresses system
frequency and creates unscheduled power
flows. In practice, minor deviations at the control
area level are routine; it is very difficult to
maintain an exact balance between generation
and demand. Accordingly, NERC has established
operating rules that specify maximum
permissible deviations, and focus on prohibiting
persistent deviations, but not instantaneous
ones. NERC monitors the performance of control
areas through specific measures of control
performance that gauge how accurately each
control area matches its load and generation.

Figure 4.11 shows Eastern Interconnection
frequency on August 14, 2003. Frequency declines or
increases from a mismatch between generation
and load on the order of about 3,200 MW per
0.1 Hertz (alternatively, a change in load or
generation of 1,000 MW would cause a frequency
change of about ±0.031 Hz). Significant frequency
excursions reflect large changes in load relative to
generation and could cause unscheduled flows
between control areas and even, in the extreme,
cause automatic under-frequency load-shedding
or automatic generator trips.
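
As a quick arithmetic check of the sensitivity quoted in the text, the
sketch below converts a generation-load mismatch into an approximate
frequency change. The constant of about 3,200 MW per 0.1 Hz is the
report's order-of-magnitude figure; the function name and the strictly
linear relationship are simplifying assumptions made for illustration,
not a model of governor or load response.

```python
# Approximate frequency change for a given generation-load mismatch,
# assuming the linear sensitivity quoted in the text.
MW_PER_TENTH_HZ = 3200.0  # from the report: about 3,200 MW per 0.1 Hz


def frequency_change_hz(mismatch_mw: float) -> float:
    """Return the approximate frequency change (Hz) for a mismatch (MW).

    Positive mismatch = excess generation (frequency rises);
    negative mismatch = generation deficit (frequency falls).
    """
    return 0.1 * mismatch_mw / MW_PER_TENTH_HZ


if __name__ == "__main__":
    for mw in (1000.0, -1000.0, 3200.0):
        print(f"{mw:+8.0f} MW -> {frequency_change_hz(mw):+.4f} Hz")
```

Under this assumption a 1,000 MW change works out to roughly 0.031 Hz,
which agrees with the figure given in the text.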

The investigation team examined Eastern
Interconnection frequency and Area Control Error
(ACE) for August 14, 2003 and the entire month of
August, looking for patterns and anomalies.
Extensive analysis using Fast Fourier Transforms
(described in the NERC Technical Report)
revealed no unusual variations. Rather,
transforms using various time samples of average
frequency (from 1 hour to 6 seconds in length)
indicate that the Eastern Interconnection
exhibits regular deviations. 16
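
The kind of spectral check described above can be sketched as follows.
This is an illustration on synthetic data, not the investigation team's
method or measurements: the one-minute sampling interval, the 0.02 Hz
hourly deviation, and all function names are assumptions. A naive
discrete Fourier transform is used so that the example needs only the
standard library; a Fast Fourier Transform computes the same quantities
more efficiently.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |X[k]| for k = 0..N//2 using a naive O(N^2) DFT."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        mags.append(abs(acc))
    return mags


def dominant_period(samples, sample_interval_s):
    """Return the period (seconds) of the strongest non-DC component."""
    mean = sum(samples) / len(samples)
    mags = dft_magnitudes([x - mean for x in samples])
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return len(samples) * sample_interval_s / k


# Synthetic data (assumed): 6 hours of 1-minute average frequency with a
# small hourly deviation superimposed on 60 Hz.
SAMPLE_INTERVAL_S = 60
synthetic = [
    60.0 + 0.02 * math.sin(2 * math.pi * t * SAMPLE_INTERVAL_S / 3600)
    for t in range(360)
]

if __name__ == "__main__":
    period = dominant_period(synthetic, SAMPLE_INTERVAL_S)
    print(f"dominant period: {period / 60:.1f} minutes")
```

On real data, a regular deviation such as an hourly schedule change
would appear as a peak at the corresponding period, while an unusual
oscillation would show up as an unexpected peak elsewhere.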

The largest deviations in frequency occur at regular
intervals. These intervals coincide with interchange
schedule changes at the peak to off-peak
transitions (06:00 to 07:00 and 21:00 to 22:00, as
shown in Figure 4.12) and with regular hourly and
half-hour schedule changes as power plants ramp
up and down to serve scheduled purchases and
interchanges. Frequency tends to run high in the
early part of the day because extra generation
capacity is committed and waiting to be
dispatched for the afternoon peak, and then runs
lower in the afternoon as load rises relative to
available generation and spinning reserve. The
investigation team concluded that frequency data
collection and frequency management in the
Eastern Interconnection should be improved, but that
frequency oscillations before 16:09 EDT on
August 14 had no effect on the blackout.

Figure 4.10. Impact of Perry Unit Outage on
Cleveland-Akron Area Voltage Stability

Figure 4.11. Frequency on August 14, 2003,
up to 16:09 EDT (horizontal axis: Time - EDT)

Conclusion 

Determining that the system was in a reliable
operational state at 15:05 EDT is extremely
significant for understanding the causes of the blackout.
It means that none of the electrical conditions on
the system before 15:05 EDT was a cause of the
blackout. This eliminates low voltages earlier in
the day or on prior days, the unavailability of
individual generators or transmission lines (either
individually or in combination with one another),
high power flows to Canada, unusual system
frequencies, and many other issues as direct,
principal or sole causes of the blackout.

Although FirstEnergy’s system was technically in 
secure electrical condition before 15:05 EDT, it 
was still highly vulnerable, because some of its 
assumptions and limits were not accurate for safe 
operating criteria. Analysis of Cleveland-Akron 
area voltages and reactive margins shows that 
FirstEnergy was operating that system on the very 
edge of NERC operational reliability standards, 
and that it could have been compromised by a 
number of potentially disruptive scenarios that 
were foreseeable by thorough planning and
operations studies. A system with this little reactive
margin would leave little room for adjustment,
with few relief actions available to operators in the
face of single or multiple contingencies. As the
next chapter will show, the vulnerability created
by inadequate system planning and understanding
was exacerbated because the FirstEnergy
operators were not adequately trained or prepared to
recognize and deal with emergency situations.


Figure 4.12. Hourly Deviations in Eastern 
Interconnection Frequency for the Month of 
August 2003 



Endnotes 

1 FE transcripts, Channel 14, 13:33:44. 

2 FE transcripts, Channel 14 at 13:21:05; Channel 3 at
13:41:54; 15:30:36.

3 “ECAR Investigation of August 14, 2003 Blackout by Major 
System Disturbance Analysis Task Force, Recommendations 
Report,” page 6. 

4 Transmission operator at FE requested the restoration of
the Avon Substation capacitor bank #2. Example at Channel
3, 13:33:40. However, no additional capacitors were
available.

5 From 13:13 through 13:28, reliability operator at FE called 
nine plant operators to request additional voltage support. 
Examples at Channel 16, 13:13:18, 13:15:49, 13:16:44, 
13:20:44, 13:22:07, 13:23:24, 13:24:38, 13:26:04, 13:28:40. 

6 DOE/NERC fact-finding meeting, September 2003,
statement by Mr. Steve Morgan (FE), PR0890803, lines 5-23.

7 See 72 FERC 61,040, the order issued for FERC dockets EL 
94-75-000 and EL 94-80-000, for details of this incident. 

8 Testimony by Stanley Szwed, Vice President of
Engineering and Planning, Centerior Service Company (Cleveland
Electric Illuminating Company and Toledo Edison), FERC 
docket EL 94-75-000, February 22, 1996. 

9 Presentation notes for January 10, 2003 meeting between 
AEP and FirstEnergy, and meeting summary notes by Paul 
Johnson, AEP Manager, East Bulk Transmission Planning, 
January 10, 2003. 

10 “Talking Points” for May 21, 2003 meeting between AEP 
and FirstEnergy, prepared by AEP. 

11 Memo, “Summary of AEP/FE Meeting on 5/21/03,” by Scott 
P. Lockwood, AEP, May 29, 2003. 

12 Ibid.

13 Testimony by Stanley Szwed, Vice President of
Engineering and Planning, Centerior Service Company (Cleveland
Electric Illuminating Company and Toledo Edison), FERC 
docket EL 94-75-000, February 22, 1996. 

14 FE transcript, Channel 8. 

15 FE transcript, Channel 8. 

16 See NERC Blackout Investigation Technical Reports, to be
released in 2004.

5. How and Why the Blackout Began in Ohio


Summary 

This chapter explains the major events (electrical,
computer, and human) that occurred as the
blackout evolved on August 14, 2003, and
identifies the causes of the initiation of the blackout.
The period covered in this chapter begins at 12:15
Eastern Daylight Time (EDT) on August 14, 2003
when inaccurate input data rendered MISO’s state
estimator (a system monitoring tool) ineffective.
At 13:31 EDT, FE’s Eastlake 5 generation unit
tripped and shut down automatically. Shortly after
14:14 EDT, the alarm and logging system in FE’s
control room failed and was not restored until
after the blackout. After 15:05 EDT, some of FE’s
345-kV transmission lines began tripping out
because the lines were contacting overgrown trees
within the lines’ right-of-way areas.

By around 15:46 EDT when FE, MISO and
neighboring utilities had begun to realize that the FE
system was in jeopardy, the only way that the
blackout might have been averted would have
been to drop at least 1,500 MW of load around
Cleveland and Akron. No such effort was made,
however, and by 15:46 EDT it may already have
been too late for a large load-shed to make any
difference. After 15:46 EDT, the loss of some of FE’s
key 345-kV lines in northern Ohio caused its 
underlying network of 138-kV lines to begin to 
fail, leading in turn to the loss of FE’s Sammis-Star 
345-kV line at 16:06 EDT. The chapter concludes 
with the loss of FE’s Sammis-Star line, the event 
that triggered the uncontrollable 345 kV cascade 
portion of the blackout sequence. 

The loss of the Sammis-Star line triggered the
cascade because it shut down the 345-kV path into
northern Ohio from eastern Ohio. Although the
area around Akron, Ohio was already blacked out
due to earlier events, most of northern Ohio
remained interconnected and electricity demand
was high. This meant that the loss of the heavily
overloaded Sammis-Star line instantly created
major and unsustainable burdens on lines in
adjacent areas, and the cascade spread rapidly as lines
and generating units automatically tripped by
protective relay action to avoid physical damage.

Chapter Organization 

This chapter is divided into several phases that 
correlate to major changes within the FirstEnergy 
system and the surrounding area in the hours 
leading up to the cascade: 

♦ Phase 1: A normal afternoon degrades

♦ Phase 2: FE’s computer failures 

♦ Phase 3: Three FE 345-kV transmission line
failures and many phone calls

♦ Phase 4: The collapse of the FE 138-kV system 
and the loss of the Sammis-Star line. 

Key events within each phase are summarized in 
Figure 5.1, a timeline of major events in the origin 
of the blackout in Ohio. The discussion that
follows highlights and explains these significant
events within each phase and explains how the
events were related to one another and to the
cascade. Specific causes of the blackout and
associated recommendations are identified by icons.

Phase 1: 

A Normal Afternoon Degrades: 
12:15 EDT to 14:14 EDT 

Overview of This Phase 

Northern Ohio was experiencing an ordinary 
August afternoon, with loads moderately high to 
serve air conditioning demand, consuming high 
levels of reactive power. With two of Cleveland’s 
active and reactive power production anchors 
already shut down (Davis-Besse and Eastlake 4), 
the loss of the Eastlake 5 unit at 13:31 EDT further 
depleted critical voltage support for the
Cleveland-Akron area. Detailed simulation modeling
reveals that the loss of Eastlake 5 was a significant
factor in the outage later that afternoon:
with Eastlake 5 out of service, transmission line
loadings were notably higher but well within
normal ratings. After the loss of FE’s
Harding-Chamberlin line at 15:05 EDT, the system
eventually became unable to sustain additional
contingencies, even though key 345-kV line
loadings did not exceed their normal ratings. Had
Eastlake 5 remained in service, subsequent line
loadings would have been lower. Loss of Eastlake
5, however, did not initiate the blackout. Rather,
subsequent computer failures leading to the loss
of situational awareness in FE’s control room and
the loss of key FE transmission lines due to
contacts with trees were the most important causes.

Figure 5.1. Timeline: Start of the Blackout in Ohio

At 14:02 EDT, Dayton Power & Light’s (DPL)
Stuart-Atlanta 345-kV line tripped off-line due to a
tree contact. This line had no direct electrical
effect on FE’s system, but it did affect MISO’s
performance as reliability coordinator, even though
PJM is the reliability coordinator for the DPL line.
One of MISO’s primary system condition
evaluation tools, its state estimator, was unable to assess
system conditions for most of the period between
12:15 and 15:34 EDT, due to a combination of
human error and the effect of the loss of DPL’s
Stuart-Atlanta line on other MISO lines as reflected in
the state estimator’s calculations. Without an
effective state estimator, MISO was unable to
perform contingency analyses of generation and line
losses within its reliability zone. Therefore,
through 15:34 EDT MISO could not determine
that with Eastlake 5 down, other transmission
lines would overload if FE lost a major
transmission line, and could not issue appropriate
warnings and operational instructions.

In the investigation interviews, all utilities,
control area operators, and reliability coordinators
indicated that the morning of August 14 was a
reasonably typical day. 1 FE managers referred to it as
peak load conditions on a less than peak load day.
Dispatchers consistently said that while voltages
were low, they were consistent with historical
voltages. 2 Throughout the morning and early
afternoon of August 14, FE reported a growing
need for voltage support in the upper Midwest.

The FE reliability operator was concerned about
low voltage conditions on the FE system as early
as 13:13 EDT. He asked for voltage support (i.e.,
increased reactive power output) from FE’s
interconnected generators. Plants were operating in
automatic voltage control mode (reacting to
system voltage conditions and needs rather than
constant reactive power output). As directed in FE’s
Manual of Operations, 3 the FE reliability operator
began to call plant operators to ask for additional
voltage support from their units. He noted to most
of them that system voltages were sagging “all
over.” Several mentioned that they were already at
or near their reactive output limits. None were
asked to reduce their real power output to be able
to produce more reactive output. He called the
Sammis plant at 13:13 EDT, West Lorain at 13:15
EDT, Eastlake at 13:16 EDT, made three calls to
unidentified plants between 13:20 EDT and 13:23
EDT, a “Unit 9” at 13:24 EDT, and two more at
13:26 EDT and 13:28 EDT. 4 The operators worked
to get shunt capacitors at Avon that were out of
service restored to support voltage, 5 but those
capacitors could not be restored to service.

Following the loss of Eastlake 5 at 13:31 EDT, FE’s
operators’ concern about voltage levels increased.
They called Bay Shore at 13:41 EDT and Perry at
13:43 EDT to ask the plants for more voltage
support. Again, while there was substantial effort to
support voltages in the Ohio area, FirstEnergy
personnel characterized the conditions as not being
unusual for a peak load day, although this was not
an all-time (or record) peak load day. 6

Energy Management System (EMS) and Decision Support Tools

Operators look at potential problems that could
arise on their systems by using contingency
analyses, driven from state estimation, that are fed by
data collected by the SCADA system.

SCADA: System operators use Supervisory Control
and Data Acquisition systems to acquire power
system data and control power system
equipment. SCADA systems have three types of
elements: field remote terminal units (RTUs),
communication to and between the RTUs, and
one or more Master Stations.

Field RTUs, installed at generation plants and
substations, are combination data gathering and
device control units. They gather and provide
information of interest to system operators, such
as the status of a breaker (switch), the voltage on
a line or the amount of real and reactive power
being produced by a generator, and execute
control operations such as opening or closing a
breaker. Telecommunications facilities, such as
telephone lines or microwave radio channels, are
provided for the field RTUs so they can
communicate with one or more SCADA Master Stations
or, less commonly, with each other.

Master Stations are the pieces of the SCADA
system that initiate a cycle of data gathering from the
field RTUs over the communications facilities,
with time cycles ranging from every few seconds
to as long as several minutes. In many power
systems, Master Stations are fully integrated into the
control room, serving as the direct interface to
the Energy Management System (EMS), receiving
incoming data from the field RTUs and relaying
control operations commands to the field devices
for execution.

State Estimation: Transmission system operators
must have visibility (condition information) over
their own transmission facilities, and recognize
the impact on their own systems of events and
facilities in neighboring systems. To accomplish
this, system state estimators use the real-time
data measurements available on a subset of those
facilities in a complex mathematical model of the
power system that reflects the configuration of
the network (which facilities are in service and
which are not) and real-time system condition
data to estimate voltage at each bus, and to
estimate real and reactive power flow quantities on
each line or through each transformer. Reliability
coordinators and control areas that have them
commonly run a state estimator at regular
intervals or only as the need arises (i.e., upon
demand). Not all control areas use state
estimators.

Contingency Analysis: Given the state
estimator’s representation of current system conditions,
a system operator or planner uses contingency
analysis to analyze the impact of specific outages
(lines, generators, or other equipment) or higher
load, flow, or generation levels on the security of
the system. The contingency analysis should
identify problems such as line overloads or
voltage violations that will occur if a new event
(contingency) happens on the system. Some
transmission operators and control areas have
and use state estimators to produce base cases
from which to analyze next contingencies (“N-1,”
meaning normal system minus 1 key element)
from the current conditions. This tool is typically
used to assess the reliability of system operation.
Many control areas do not use real time
contingency analysis tools, but others run them on
demand following potentially significant system
events.

Figure 5.2. Timeline Phase 1

Key Phase 1 Events 

1A) 12:15 EDT to 16:04 EDT: MISO’s state estimator
software solution was compromised, and
MISO’s single contingency reliability
assessment became unavailable.

1B) 13:31:34 EDT: Eastlake Unit 5 generation
tripped in northern Ohio.

1C) 14:02 EDT: Stuart-Atlanta 345-kV
transmission line tripped in southern Ohio.

1A) MISO’s State Estimator Was Turned Off:
12:15 EDT to 16:04 EDT

It is common for reliability coordinators and
control areas to use a state estimator (SE) to improve
the accuracy of the raw sampled data they have for
the electric system by mathematically processing
raw data to make it consistent with the electrical
system model. The resulting information on
equipment voltages and loadings is used in
software tools such as real time contingency analysis
(RTCA) to simulate various conditions and
outages to evaluate the reliability of the power
system. The RTCA tool is used to alert operators if the
system is operating insecurely; it can be run
on a regular schedule (e.g., every 5 minutes), when
triggered by some system event (e.g., the loss of a
power plant or transmission line), or when
initiated by an operator. MISO usually runs the SE
every 5 minutes, and the RTCA less frequently. If
the model does not have accurate and timely
information about key pieces of system equipment or if
key input data are wrong, the state estimator may
be unable to reach a solution or it will reach a
solution that is labeled as having a high degree of error.
In August, MISO considered its SE and RTCA
tools to be still under development and not fully
mature; those systems have since been completed
and placed into full operation.

On August 14 at about 12:15 EDT, MISO’s state
estimator produced a solution with a high
mismatch (outside the bounds of acceptable error).
This was traced to an outage of Cinergy’s
Bloomington-Denois Creek 230-kV line:
although it was out of service, its status was not
updated in MISO’s state estimator. Line status
information within MISO’s reliability
coordination area is transmitted to MISO by the ECAR data
network or direct links and is intended to be
automatically linked to the SE. This requires
coordinated data naming as well as instructions that link
the data to the tools. For this line, the automatic
linkage of line status to the state estimator had not
yet been established. The line status was corrected
and MISO’s analyst obtained a good SE solution at
13:00 EDT and an RTCA solution at 13:07 EDT.
However, to troubleshoot this problem the analyst
had turned off the automatic trigger that runs the
state estimator every five minutes. After fixing the
problem he forgot to re-enable it, so although he
had successfully run the SE and RTCA manually
to reach a set of correct system analyses, the tools
were not returned to normal automatic operation.
Thinking the system had been successfully
restored, the analyst went to lunch.

The fact that the state estimator
was not running automatically on
its regular 5-minute schedule was
discovered about 14:40 EDT. The
automatic trigger was re-enabled
but again the state estimator failed to solve
successfully. This time investigation identified the
Stuart-Atlanta 345-kV line outage (which
occurred at 14:02 EDT) to be the likely cause. This
line is within the Dayton Power and Light control
area in southern Ohio and is under PJM’s
reliability umbrella rather than MISO’s. Even though it
affects electrical flows within MISO, its status had
not been automatically linked to MISO’s state
estimator.

The discrepancy between actual measured system
flows (with Stuart-Atlanta off-line) and the MISO
model (which assumed Stuart-Atlanta on-line)
prevented the state estimator from solving
correctly. At 15:09 EDT, when informed by the
system engineer that the Stuart-Atlanta line appeared
to be the problem, the MISO operator said
(mistakenly) that this line was in service. The system
engineer then tried unsuccessfully to reach a
solution with the Stuart-Atlanta line modeled as in
service until approximately 15:29 EDT, when the
MISO operator called PJM to verify the correct
status. After they determined that Stuart-Atlanta had
tripped, they updated the state estimator and it
solved successfully. The RTCA was then run
manually and solved successfully at 15:41 EDT.
MISO’s state estimator and contingency analysis
were back under full automatic operation and
solving effectively by 16:04 EDT, about two
minutes before the start of the cascade.
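
The effect of a stale line status on a state estimator can be
illustrated with a deliberately small sketch. It is not MISO's software
or network model: the two parallel lines, their reactances, the
measurements, the 5 MW tolerance, and every function name are
hypothetical, and a single-state least-squares fit stands in for a full
estimator. A real estimator may fail to converge at all, whereas this
toy version always returns a fit and is judged only by its residuals.

```python
# Toy one-state estimator: two parallel lines between bus A and bus B,
# DC approximation (flow = angle difference / reactance).
MISMATCH_TOLERANCE_MW = 5.0  # assumed acceptance threshold


def model_coefficients(x1, x2, line2_in_service):
    """Model rows for [flow on line 1, flow on line 2, total A-to-B transfer]."""
    b1 = 1.0 / x1
    b2 = 1.0 / x2 if line2_in_service else 0.0
    return [b1, b2, b1 + b2]


def estimate(measurements, coefficients):
    """Least-squares fit of one state (angle difference) to the measurements."""
    theta = sum(h * z for h, z in zip(coefficients, measurements)) / sum(
        h * h for h in coefficients
    )
    residuals = [z - h * theta for h, z in zip(coefficients, measurements)]
    return theta, residuals


def solution_is_acceptable(residuals, tolerance=MISMATCH_TOLERANCE_MW):
    """True when every measurement is matched within the tolerance."""
    return max(abs(r) for r in residuals) <= tolerance


# Telemetry with line 2 actually tripped: all 100 MW flows on line 1.
measured = [100.0, 0.0, 100.0]

# Stale topology: the model still assumes line 2 is in service.
_, stale_residuals = estimate(measured, model_coefficients(0.1, 0.1, True))
# Corrected topology: line 2 modeled as out of service.
_, fixed_residuals = estimate(measured, model_coefficients(0.1, 0.1, False))

if __name__ == "__main__":
    print("stale model acceptable:", solution_is_acceptable(stale_residuals))
    print("fixed model acceptable:", solution_is_acceptable(fixed_residuals))
```

With the stale topology the best fit splits the flow evenly between the
two lines and misses the measurements by 50 MW; once the tripped line is
removed from the model, the same measurements are matched exactly.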


Cause 4: Inadequate RC Diagnostic Support

In summary, the MISO state estimator and real
time contingency analysis tools were effectively
out of service between 12:15 EDT and 16:04 EDT.
This prevented MISO from promptly performing
precontingency “early warning” assessments of
power system reliability over the afternoon of
August 14.

Recommendations: 3, page 143; 6, page 147;
30, page 163


1B) Eastlake Unit 5 Tripped: 13:31 EDT

Eastlake Unit 5 (rated at 597 MW) is in northern
Ohio along the southern shore of Lake Erie,
connected to FE’s 345-kV transmission system (Figure
5.3). The Cleveland and Akron loads are generally
supported by generation from a combination of
the Eastlake, Perry and Davis-Besse units, along
with significant imports, particularly from
9,100 MW of generation located along the Ohio
and Pennsylvania border. The unavailability of
Eastlake 4 and Davis-Besse meant that FE had to
import more energy into the Cleveland-Akron area
to support its load.

When Eastlake 5 dropped off-line, replacement
power transfers and the associated reactive power
to support the imports to the local area
contributed to the additional line loadings in the region.
At 15:00 EDT on August 14, FE’s load was
approximately 12,080 MW, and they were importing
about 2,575 MW, 21% of their total. FE’s system
reactive power needs rose further.


Cause 1: Inadequate System Understanding

The investigation team’s system
simulations indicate that the loss
of Eastlake 5 was a critical step in
the sequence of events. Contingency
analysis simulation of the
conditions following the loss of the
Harding-Chamberlin 345-kV circuit at 15:05 EDT
showed that the system would be unable to
sustain some contingencies without line overloads
above emergency ratings. However, when Eastlake
5 was modeled as in service and fully available in
those simulations, all overloads above emergency
limits were eliminated, even with the loss of
Harding-Chamberlin.
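
A contingency screen of the general kind described above can be
sketched as follows. This is not the investigation team's simulation:
it assumes a linear approximation in which a fixed fraction of an
outaged line's flow (an outage transfer distribution factor) shifts
onto each remaining line, and every line name, flow, rating, and factor
below is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    name: str
    flow_mw: float              # pre-contingency flow
    emergency_rating_mw: float


def screen_n_minus_1(lines, otdf):
    """Return (outaged, monitored, post_flow_mw) for each emergency overload.

    otdf[(outaged, monitored)] is the fraction of the outaged line's
    pre-contingency flow that shifts onto the monitored line.
    """
    by_name = {line.name: line for line in lines}
    overloads = []
    for (outaged, monitored), factor in sorted(otdf.items()):
        post_flow = by_name[monitored].flow_mw + factor * by_name[outaged].flow_mw
        if abs(post_flow) > by_name[monitored].emergency_rating_mw:
            overloads.append((outaged, monitored, post_flow))
    return overloads


# Hypothetical three-line system; none of these numbers come from the report.
LINES = [
    Line("Line A", 800.0, 1000.0),
    Line("Line B", 600.0, 900.0),
    Line("Line C", 400.0, 700.0),
]
OTDF = {
    ("Line A", "Line B"): 0.5,
    ("Line A", "Line C"): 0.5,
    ("Line B", "Line A"): 0.3,
    ("Line B", "Line C"): 0.4,
    ("Line C", "Line A"): 0.4,
    ("Line C", "Line B"): 0.6,
}

if __name__ == "__main__":
    for outaged, monitored, flow in screen_n_minus_1(LINES, OTDF):
        print(f"loss of {outaged} overloads {monitored}: {flow:.0f} MW")
```

In this made-up case only the loss of Line A produces overloads above
emergency ratings, so the system would not be N-1 secure until flows on
Line A were reduced or other relief was found.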


Recommendation: 23, page 160


Cause 2: Inadequate Situational Awareness

FE did not perform a contingency
analysis after the loss of Eastlake
5 at 13:31 EDT to determine
whether the loss of further lines
or plants would put their system
at risk. FE also did not perform a contingency
analysis after the loss of Harding-Chamberlin at 15:05
EDT (in part because they did not know that it had
tripped out of service), nor does the utility
routinely conduct such studies. 7 Thus FE did not
discover that their system was no longer in an N-1
secure state at 15:05 EDT, and that operator action
was needed to remedy the situation.

Recommendations: 3, page 143; 22, page 159

Figure 5.3. Eastlake Unit 5


1C) Stuart-Atlanta 345-kV Line Tripped:
14:02 EDT

Cause 1: Inadequate System Understanding

The Stuart-Atlanta 345-kV
transmission line is in the control area
of Dayton Power and Light. At
14:02 EDT the line tripped due to
contact with a tree, causing a
short circuit to ground, and locked out.
Investigation team modeling reveals that the loss of DPL’s
Stuart-Atlanta line had no significant electrical
effect on power flows and voltages in the FE area.
The team examined the security of FE’s system,
testing power flows and voltage levels with the
combination of plant and line outages that evolved
on the afternoon of August 14. This analysis
shows that the availability or unavailability of the
Stuart-Atlanta 345-kV line did not change the
capability or performance of FE’s system or affect
any line loadings within the FE system, either
immediately after its trip or later that afternoon.
The only reason why Stuart-Atlanta matters to the
blackout is because it contributed to the failure of
MISO’s state estimator to operate effectively, so
MISO could not fully identify FE’s precarious
system conditions until 16:04 EDT. 8


Data Exchanged for Operational Reliability 

The topology of the electric system is essentially 
the road map of the grid. It is determined by how 
each generating unit and substation is connected 
to all other facilities in the system and at what 
voltage levels, the size of the individual
transmission wires, the electrical characteristics of each
of those connections, and where and when series
and shunt reactive devices are in service. All of
these elements affect the system’s impedance,
the physics of how and where power will
flow across the system. Topology and impedance
are modeled in power-flow programs, state
estimators, and contingency analysis software used
to evaluate and manage the system.

Topology processors are used as front-end
processors for state estimators and operational
display and alarm systems. They convert the digital
telemetry of breaker and switch status to be used
by state estimators, and for displays showing
lines being opened or closed or reactive devices
in or out of service.
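
The conversion a topology processor performs can be sketched in a few
lines. This is a simplified illustration, not any vendor's
implementation: it assumes exactly one breaker at each end of a line
(real substations often have several breakers and switches per
terminal), and the line names and function names are hypothetical.

```python
def line_status(from_breaker_closed, to_breaker_closed):
    """Derive a line's status from the breakers at its two ends.

    Each argument is True (closed), False (open), or None (no telemetry).
    Returns True (in service), False (out of service), or None (unknown).
    """
    if from_breaker_closed is False or to_breaker_closed is False:
        return False  # an open breaker at either end takes the line out
    if from_breaker_closed is None or to_breaker_closed is None:
        return None   # cannot confirm that the line is in service
    return True


def process_topology(breaker_telemetry):
    """Map each line name to its derived status."""
    return {name: line_status(*ends) for name, ends in breaker_telemetry.items()}


# Hypothetical telemetry: (from-end breaker, to-end breaker) per line.
TELEMETRY = {
    "Line X": (True, True),
    "Line Y": (True, False),
    "Line Z": (True, None),
}

if __name__ == "__main__":
    for name, status in process_topology(TELEMETRY).items():
        label = {True: "in service", False: "out of service", None: "unknown"}[status]
        print(f"{name}: {label}")
```

Reporting an explicit "unknown" rather than assuming a line is in
service is one possible design choice; it flags facilities whose status
is not linked by telemetry and must be confirmed by other means.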

A variety of up-to-date information on the
elements of the system must be collected and
exchanged for modeled topology to be accurate
in real time. If data on the condition of system
elements are incorrect, a state estimator will not
successfully solve or converge because the
real-world line flows and voltages being reported
will disagree with the modeled solution.

Data Needed: A variety of operational data is
collected and exchanged between control areas and
reliability coordinators to monitor system
performance, conduct reliability analyses, manage
congestion, and perform energy accounting. The
data exchanged range from real-time system
data, which is exchanged every 2 to 4 seconds, to
OASIS reservations and electronic tags that
identify individual energy transactions between
parties. Much of these data are collected through
operators’ SCADA systems.

ICCP: Real-time operational data is exchanged
and shared as rapidly as it is collected. The data
is passed between the control centers using an
Inter-Control Center Communications Protocol
(ICCP), often over private frame relay networks.
NERC operates one such network, known as
NERCNet. ICCP data are used for minute-to-minute
operations to monitor system conditions
and control the system, as well as in state
estimators and contingency analysis tools. They
include items such as line flows, voltages,
generation levels, dynamic interchange schedules,
area control error (ACE), and system frequency.

IDC: Since the actual power flows along the path
of least resistance in accordance with the laws of
physics, the NERC Interchange Distribution
Calculator (IDC) is used to determine where it will
actually flow. The IDC is a computer software
package that calculates the impacts of existing or
proposed power transfers on the transmission
components of the Eastern Interconnection. The
IDC uses a power flow model of the
interconnection, representing over 40,000 substation buses,
55,000 lines and transformers, and more than
6,000 generators. This model calculates transfer
distribution factors (TDFs), which tell how a
power transfer would load up each system

(continued on page 51) 
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Phase 2: 

FE’s Computer Failures: 

14:14 EDT to 15:59 EDT 

Overview of This Phase 

Starting around 14:14 EDT, FE’s control room 
operators lost the alarm function that provided 
audible and visual indications when a significant 
piece of equipment changed from an acceptable to 
a problematic condition. Shortly thereafter, the 
EMS system lost a number of its remote control 
consoles. Next it lost the primary server computer 


that was hosting the alarm function, and then the 
backup server such that all functions that were 
being supported on these servers were stopped at 
14:54 EDT. However, for over an hour no one in 
FE’s control room grasped that their computer sys¬ 
tems were not operating properly, even though 
FE’s Information Technology support staff knew 
of the problems and were working to solve them, 
and the absence of alarms and other symptoms 
offered many clues to the operators of the EMS 
system’s impaired state. Thus, without a function¬ 
ing EMS or the knowledge that it had failed, FE’s 
system operators remained unaware that their 
electrical system condition was beginning to 


Data Exchanged for Operational Reliability (Continued) 


element, and outage transfer distribution factors 
(OTDFs), which tell how much power would be 
transferred to a system element if another spe¬ 
cific system element were lost. 

The IDC model is updated through the NERC 
System Data Exchange (SDX) system to reflect 
line outages, load levels, and generation outages. 
Power transfer information is input to the IDC 
through the NERC electronic tagging (E-Tag) 
system. 

SDX: The IDC depends on element status infor¬ 
mation, exchanged over the NERC System Data 
Exchange (SDX) system, to keep the system 
topology current in its powerflow model of the 
Eastern Interconnection. The SDX distributes 
generation and transmission outage information 
to all operators, as well as demand and operating 
reserve projections for the next 48 hours. These 
data are used to update the IDC model, which is 
used to calculate the impact of power transfers 
across the system on individual transmission 
system elements. There is no current require¬ 
ment for how quickly asset owners must report 
changes in element status (such as a line outage) 
to the SDX—some entities update it with facility 
status only once a day, while others submit new 
information immediately after an event occurs. 
NERC is now developing a requirement for regu¬ 
lar information update submittals that is sched¬ 
uled to take effect in the summer of 2004. 

SDX data are used by some control centers to 
keep their topology up-to-date for areas of the 
interconnection that are not observable through 
direct telemetry or ICCP data. A number of trans¬ 
mission providers also use these data to update 
their transmission models for short-term 


determination of available transmission capabil¬ 
ity (ATC). 

E-Tags: All inter-control area power transfers are 
electronically tagged (E-Tag) with critical infor¬ 
mation for use in reliability coordination and 
congestion management systems, particularly 
the IDC in the Eastern Interconnection. The 
Western Interconnection also exchanges tagging 
information for reliability coordination and use 
in its unscheduled flow mitigation system. An 
E-Tag includes information about the size of the 
transfer, when it starts and stops, where it starts 
and ends, and the transmission service providers 
along its entire contract path, the priorities of the 
transmission service being used, and other 
pertinent details of the transaction. More than 
100,000 E-Tags are exchanged every month, 
representing about 100,000 GWh of transactions. 
The information in the E-Tags is used to facili¬ 
tate curtailments as needed for congestion 
management. 

Voice Communications: Voice communication
between control area operators and reliability
coordinators is an essential part of exchanging
operational data.
When telemetry or electronic communications 
fail, some essential data values have to be manu¬ 
ally entered into SCADA systems, state estima¬ 
tors, energy scheduling and accounting software, 
and contingency analysis systems. Direct voice 
contact between operators enables them to 
replace key data with readings from the other 
systems’ telemetry, or surmise what an appropri¬ 
ate value for manual replacement should be. 
Also, when operators see spurious readings or 
suspicious flows, direct discussions with neigh¬ 
boring control centers can help avert problems 
like those experienced on August 14, 2003.

Figure 5.4. Timeline Phase 2

degrade. Unknowingly, they used the outdated
system condition information they did have to dis¬ 
count information from others about growing sys¬ 
tem problems. 

Key Events in This Phase 

2A) 14:14 EDT: FE alarm and logging software 
failed. Neither FE’s control room operators 
nor FE’s IT EMS support personnel were 
aware of the alarm failure. 

2B) 14:20 EDT: Several FE remote EMS consoles 
failed. FE’s Information Technology (IT) engi¬ 
neer was computer auto-paged. 

2C) 14:27:16 EDT: Star-South Canton 345-kV 
transmission line tripped and successfully 
reclosed. 

2D) 14:32 EDT: AEP called FE control room about 
AEP indication of Star-South Canton 345-kV 
line trip and reclosure. FE had no alarm or log 
of this line trip. 

2E) 14:41 EDT: The primary FE control system 
server hosting the alarm function failed. Its 
applications and functions were passed over 
to a backup computer. FE’s IT engineer was 
auto-paged. 

2F) 14:54 EDT: The FE back-up computer failed 
and all functions that were running on it 
stopped. FE’s IT engineer was auto-paged. 

Failure of FE’s Alarm System 

FE’s computer SCADA alarm and logging software
failed sometime shortly after 14:14 EDT (the last
time that a valid alarm came in), after voltages had
begun deteriorating but well
before any of FE’s lines began to contact trees and 
trip out. After that time, the FE control room con¬ 
soles did not receive any further alarms, nor were 
there any alarms being printed or posted on the 
EMS’s alarm logging facilities. Power system oper¬ 
ators rely heavily on audible and on-screen 
alarms, plus alarm logs, to reveal any significant 
changes in their system’s conditions. After 14:14 
EDT on August 14, FE’s operators were working 
under a significant handicap without these tools. 
However, they were in further jeopardy because 
they did not know that they were operating with¬ 
out alarms, so that they did not realize that system 
conditions were changing. 

Alarms are a critical function of an EMS, and 
EMS-generated alarms are the fundamental means 
by which system operators identify events on the 
power system that need their attention. Without 
alarms, events indicating one or more significant 
system changes can occur but remain undetected 
by the operator. If an EMS’s alarms are absent, but 
operators are aware of the situation and the 
remainder of the EMS’s functions are intact, the 
operators can potentially continue to use the EMS 
to monitor and exercise control of their power sys¬ 
tem. In such circumstances, the operators would 
have to do so via repetitive, continuous manual 
scanning of numerous data and status points 
located within the multitude of individual dis¬ 
plays available within their EMS. Further, it 
would be difficult for the operator to identify 
quickly the most relevant of the many screens 
available. 

Cause 2: Inadequate Situational Awareness

In the same way that an alarm system can inform
operators about the failure of key grid facilities, it
can also be set up to alarm them if the alarm
system itself fails to perform properly. FE’s EMS did
not have such a notification system.
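
One common way to build this kind of self-monitoring is a heartbeat watchdog. The following is a minimal sketch under stated assumptions: the class name, the timeout, and the injected clock are all hypothetical and do not describe the XA21 or any vendor's EMS.

```python
import time


class AlarmHeartbeatMonitor:
    """Flags the alarm processor as stalled if it stops reporting progress."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_beat = clock()

    def beat(self):
        # Called by the alarm processor each time it completes a cycle.
        self.last_beat = self.clock()

    def is_stalled(self):
        # True when no heartbeat has arrived within the timeout.
        return self.clock() - self.last_beat > self.timeout_s


# Usage with a fake clock so the example runs instantly.
now = [0.0]
monitor = AlarmHeartbeatMonitor(timeout_s=30.0, clock=lambda: now[0])
now[0] = 10.0
healthy = monitor.is_stalled()   # heartbeat is 10 s old, within the timeout
now[0] = 45.0
stalled = monitor.is_stalled()   # heartbeat is 45 s old, past the timeout
```

The watchdog must run independently of the process it watches; otherwise a stall in the alarm processor would silence the watchdog as well.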

Although the alarm processing function of FE’s 
EMS failed, the remainder of that system generally 
continued to collect valid real-time status infor¬ 
mation and measurements about FE’s power sys¬ 
tem, and continued to have supervisory control 
over the FE system. The EMS also continued to 
send its normal and expected collection of infor¬ 
mation on to other monitoring points and authori¬ 
ties, including MISO and AEP. Thus these entities 
continued to receive accurate information about 
the status and condition of FE’s power system after 
the time when FE’s EMS alarms failed. FE’s opera¬ 
tors were unaware that in this situation they 
needed to manually and more closely monitor and 
interpret the SCADA information they were 


receiving. Continuing on in the belief that their 
system was satisfactory, lacking any alarms from 
their EMS to the contrary, and without visualiza¬ 
tion aids such as a dynamic map board or a projec¬ 
tion of system topology, FE control room operators 
were subsequently surprised when they began 
receiving telephone calls from other locations and 
information sources—MISO, AEP, PJM, and FE 
field operations staff—who offered information on 
the status of FE’s transmission facilities that
conflicted with FE’s system operators’
understanding of the situation.

Recommendations 3, page 143; 22, page 159


Analysis of the alarm problem performed by FE 
suggests that the alarm process essentially 
“stalled” while processing an alarm event, such 
that the process began to run in a manner that 
failed to complete the processing of that alarm or 


Alarms 

System operators must keep a close and constant 
watch on the multitude of things occurring 
simultaneously on their power system. These 
include the system’s load, the generation and 
supply resources to meet that load, available 
reserves, and measurements of critical power 
system states, such as the voltage levels on the 
lines. Because it is not humanly possible to 
watch and understand all these events and con¬ 
ditions simultaneously, Energy Management 
Systems use alarms to bring relevant information 
to operators’ attention. The alarms draw on the 
information collected by the SCADA real-time 
monitoring system. 

Alarms are designed to quickly and appropri¬ 
ately attract the power system operators’ atten¬ 
tion to events or developments of interest on the 
system. They do so using combinations of audi¬ 
ble and visual signals, such as sounds at opera¬ 
tors’ control desks and symbol or color changes 
or animations on system monitors, displays, or 
map boards. EMS alarms for power systems are 
similar to the indicator lights or warning bell 
tones that a modern automobile uses to signal its 
driver, like the “door open” bell, an image of a 
headlight high beam, a “parking brake on” indi¬ 
cator, and the visual and audible alert when a gas 
tank is almost empty. 

Power systems, like cars, use “status” alarms and 
“limit” alarms. A status alarm indicates the state 
of a monitored device. In power systems these 
are commonly used to indicate whether such 
items as switches or breakers are “open” or 


“closed” (off or on) when they should be other¬ 
wise, or whether they have changed condition 
since the last scan. These alarms should provide 
clear indication and notification to system opera¬ 
tors of whether a given device is doing what they 
think it is, or what they want it to do—for 
instance, whether a given power line is con¬ 
nected to the system and moving power at a par¬ 
ticular moment. 

EMS limit alarms are designed to provide an 
indication to system operators when something 
important that is measured on a power system 
device—such as the voltage on a line or the 
amount of power flowing across it—is below or 
above pre-specified limits for using that device 
safely and efficiently. When a limit alarm acti¬ 
vates, it provides an important early warning to 
the power system operator that elements of the 
system may need some adjustment to prevent 
damage to the system or to customer loads—like 
the “low fuel” or “high engine temperature” 
warnings in a car. 
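
The two alarm types described above can be sketched as simple checks. This is an illustration only: the function names, device states, and voltage limits are assumptions chosen for the example, not settings from FE's EMS.

```python
def status_alarm(expected_state, actual_state):
    """Status alarm: raise when a device is not in the state it should be in."""
    return actual_state != expected_state


def limit_alarm(value, low_limit, high_limit):
    """Limit alarm: raise when a measurement is outside its pre-specified band."""
    return value < low_limit or value > high_limit


# Hypothetical breaker that should be closed but is reported open.
breaker_alarm = status_alarm(expected_state="closed", actual_state="open")

# Hypothetical 345-kV bus with illustrative limits of 95% and 105% of nominal.
voltage_alarm = limit_alarm(325.0, low_limit=327.75, high_limit=362.25)
```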

When FE’s alarm system failed on August 14, its 
operators were running a complex power system 
without adequate indicators of when key ele¬ 
ments of that system were reaching and passing 
the limits of safe operation—and without aware¬ 
ness that they were running the system without 
these alarms and should no longer assume that 
not getting alarms meant that system conditions 
were still safe and unchanging.

produce any other valid output (alarms). In the
meantime, new inputs—system condition data 
that needed to be reviewed for possible 
alarms—built up in and then overflowed the pro¬ 
cess’ input buffers. 9 10 
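
The failure pattern described here, a stalled consumer with inputs still arriving, can be illustrated with a bounded buffer. This sketch is a generic illustration under assumed names and sizes; it is not a reconstruction of the XA21 alarm process, whose internal design is not described in this report.

```python
from collections import deque


class BoundedInputBuffer:
    """Fixed-capacity queue that counts inputs it has to discard."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.dropped = 0

    def put(self, item):
        # Reject new input once the buffer is full.
        if len(self.items) >= self.capacity:
            self.dropped += 1
            return False
        self.items.append(item)
        return True

    def get(self):
        # Return the oldest item, or None if the buffer is empty.
        return self.items.popleft() if self.items else None


# The consumer is stalled (get() is never called) while five inputs arrive.
buffer = BoundedInputBuffer(capacity=3)
for event_id in range(5):
    buffer.put(event_id)
```

After the loop, three events are held and two have been lost, which is why a stalled alarm process cannot simply catch up later.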

Loss of Remote EMS Terminals. Between 14:20 
EDT and 14:25 EDT, some of FE’s remote EMS ter¬ 
minals in substations ceased operation. FE has 
advised the investigation team that it believes this 
occurred because the data feeding into those ter¬ 
minals started “queuing” and overloading the ter¬ 
minals’ buffers. FE’s system operators did not 
learn about this failure until 14:36 EDT, when a 
technician at one of the sites noticed the terminal 
was not working after he came in on the 15:00 
shift, and called the main control room to report 
the problem. As remote terminals failed, each trig¬ 
gered an automatic page to FE’s Information Tech¬ 
nology (IT) staff. 11 The investigation team has not 
determined why some terminals failed whereas 
others did not. Transcripts indicate that data links 
to the remote sites were down as well. 12 


EMS Server Failures. FE’s EMS system includes 
several server nodes that perform the higher func¬ 
tions of the EMS. Although any one of them can 
host all of the functions, FE’s normal system con¬ 
figuration is to have a number of host subsets of 
the applications, with one server remaining in a 
“hot-standby” mode as a backup to the others 
should any fail. At 14:41 EDT, the primary server 
hosting the EMS alarm processing application 
failed, due either to the stalling of the alarm appli¬ 
cation, “queuing” to the remote EMS terminals, 
or some combination of the two. Following pre¬ 
programmed instructions, the alarm system appli¬ 
cation and all other EMS software running on the 
first server automatically transferred (“failed- 
over”) onto the back-up server. However, because 
the alarm application moved intact onto the 
backup while still stalled and ineffective, the 
backup server failed 13 minutes later, at 14:54 
EDT. Accordingly, all of the EMS applications on 
these two servers stopped running.

Recommendation 22, page 159


Cause 2: Inadequate Situational Awareness

The concurrent loss of both EMS servers apparently
caused several new problems for FE’s EMS and the
operators who used it. Tests run during FE’s
after-the-fact analysis of the alarm failure event
indicate that a concurrent absence of these servers
can significantly slow down the rate at which the
EMS system puts new (or refreshes existing)
displays on operators’ computer consoles. Thus at
times on
August 14th, operators’ screen refresh rates—the 
rate at which new information and displays are 
painted onto the computer screen, normally 1 to 3 
seconds—slowed to as long as 59 seconds per 
screen. Since FE operators have numerous infor¬ 
mation screen options, and one or more screens 
are commonly “nested” as sub-screens to one or 
more top level screens, operators’ ability to view, 
understand and operate their system through the 
EMS would have slowed to a frustrating crawl. 13 
This situation may have occurred between 14:54 
EDT and 15:08 EDT when both servers failed, and 
again between 15:46 EDT and 15:59 EDT while 
FE’s IT personnel attempted to reboot both servers 
to remedy the alarm problem. 


Loss of the first server caused an auto-page to be 
issued to alert FE’s EMS IT support personnel to 
the problem. When the back-up server failed, it 
too sent an auto-page to FE’s IT staff. They did not 
notify control room operators of the problem. At 
15:08 EDT, IT staffers completed a “warm reboot” 
(restart) of the primary server. Startup diagnostics 
monitored during that reboot verified that the 
computer and all expected processes were run¬ 
ning; accordingly, FE’s IT staff believed that they 
had successfully restarted the node and all the 
processes it was hosting. However, although the 
server and its applications were again running, the 
alarm system remained frozen and non-functional,
even on the restarted computer. The IT staff did
not confirm that the alarm system was again
working properly with the control room operators.

Recommendation 19, page 156


Another casualty of the loss of both servers was 
the Automatic Generation Control (AGC) function 
hosted on those computers. Loss of AGC meant 
that FE’s operators could not run affiliated 
power plants on pre-set programs to respond auto¬ 
matically to meet FE’s system load and inter¬ 
change obligations. Although the AGC did not 
work from 14:54 EDT to 15:08 EDT and 15:46 EDT 
to 15:59 EDT (periods when both servers were 
down), this loss of function does not appear to
have had an effect on the blackout.

Recommendation 22, page 159


Cause 2: Inadequate Situational Awareness

The concurrent loss of the EMS servers also caused
the failure of FE’s strip chart function. There are
many strip charts in the FE Reliability Operator
control room driven by the EMS computers,
showing a variety of system conditions, including
raw ACE (Area
Control Error), FE system load, and Sammis-South 
Canton and South Canton-Star loading. These 
charts are visible in the reliability operator control 
room. The chart printers continued to scroll but 
because the underlying computer system was 
locked up the chart pens showed only the last 
valid measurement recorded, without any varia¬ 
tion from that measurement as time progressed 
(i.e., the charts “flat-lined”). There is no indication 
that any operators noticed or reported the failed 
operation of the charts. 14 The few charts fed by 
direct analog telemetry, rather than the EMS sys¬ 
tem, showed primarily frequency data, and 
remained available throughout the afternoon of 
August 14. These yield little useful system infor¬ 
mation for operational purposes. 
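
A "flat-lined" signal of the kind described above can, in principle, be detected automatically. The sketch below is one simple approach under assumed parameters (the function name, window length, and tolerance are hypothetical); nothing in the report indicates that FE's system had such a check.

```python
def is_flatlined(samples, min_samples=5, tolerance=0.0):
    """Return True if the most recent min_samples readings show no variation."""
    if len(samples) < min_samples:
        return False  # not enough history to judge
    recent = samples[-min_samples:]
    return max(recent) - min(recent) <= tolerance


# Hypothetical readings: a live signal that freezes at its last valid value.
frozen = is_flatlined([101.2, 100.8, 99.5, 99.5, 99.5, 99.5, 99.5])
live = is_flatlined([101.2, 100.8, 99.5, 99.5, 99.5, 99.5, 99.9])
```

A real power system measurement such as load or ACE rarely repeats exactly for long, so an unchanging value is itself a clue that the data source has stopped updating.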

FE’s Area Control Error (ACE), the primary control 
signal used to adjust generators and imports to 
match load obligations, did not function between 
14:54 EDT and 15:08 EDT and later between 15:46 


EDT and 15:59 EDT, when the two servers were 
down. This meant that generators were not con¬ 
trolled during these periods to meet FE’s load and 
interchange obligations (except from 15:00 EDT to 
15:09 EDT when control was switched to a backup 
controller). There were no apparent negative con¬ 
sequences from this failure. It has not been estab¬ 
lished how loss of the primary generation control 
signal was identified or if any discussions 
occurred with respect to the computer system’s 
operational status. 15 

EMS System History. The EMS in service at FE’s 
Ohio control center is a GE Harris (now GE Net¬ 
work Systems) XA21 system. It was initially 
brought into service in 1995. Other than the appli¬ 
cation of minor software fixes or patches typically 
encountered in the ongoing maintenance and sup¬ 
port of such a system, the last major updates or 
revisions to this EMS were implemented in 1998. 
On August 14 the system was not running the 
most current release of the XA21 software. FE had 


Who Saw What? 

What data and tools did others have to monitor 
the conditions on the FE system? 

Midwest ISO (MISO), reliability coordinator for 
FE 

Alarms: MISO received indications of breaker 
trips in FE that registered in MISO’s alarms; 
however, the alarms were missed. These alarms 
require a look-up to link the flagged breaker with 
the associated line or equipment and unless this 
line was specifically monitored, require another 
look-up to link the line to the monitored 
flowgate. MISO operators did not have the capa¬ 
bility to click on the on-screen alarm indicator to 
display the underlying information. 

Real Time Contingency Analysis (RTCA): The
contingency analysis showed several hundred
violations around 15:00 EDT. This included
some FE violations, which MISO (FE’s reliability
coordinator) operators discussed with PJM
(AEP’s Reliability Coordinator). a Simulations
developed for this investigation show that viola¬ 
tions for a contingency would have occurred 
after the Harding-Chamberlin trip at 15:05 EDT. 
There is no indication that MISO addressed this 
issue. It is not known whether MISO identified 
the developing Sammis-Star problem. 


Flowgate Monitoring Tool: While an inaccuracy 
has been identified with regard to this tool it still 
functioned with reasonable accuracy and 
prompted MISO to call FE to discuss the 
Hanna-Juniper line problem. It would not have 
identified problems south of Star since that was 
not part of the flowgate and thus not modeled in 
MISO’s flowgate monitor. 

AEP 

Contingency Analysis: According to interviews, b
AEP had contingency analysis that covered lines 
into Star. The AEP operator identified a problem 
for Star-South Canton overloads for a Sammis- 
Star line loss about 15:33 EDT and asked PJM to 
develop TLRs for this. However, due to the size of 
the requested TLR, this was not implemented 
before the line tripped out of service. 

Alarms: Since a number of lines cross between 
AEP’s and FE’s systems, they had the ability at 
their respective end of each line to identify con¬ 
tingencies that would affect both. AEP initially 
noticed FE line problems with the first and sub¬ 
sequent trips of the Star-South Canton 345-kV 
line, and called FE three times between 14:35 
EDT and 15:45 EDT to determine whether FE 
knew the cause of the outages. c


a “MISO Site Visit,” Benbow interview. 
b “AEP Site Visit,” Ulrich interview. 

c Example at 14:35, Channel 4; 15:19, Channel 4; 15:45, Channel 14 (FE transcripts).

decided well before August 14 to replace it with
one from another vendor.

Recommendation 33, page 164


FE personnel told the investigation team that the 
alarm processing application had failed on occa¬ 
sions prior to August 14, leading to loss of the 
alarming of system conditions and events for FE’s 
operators. 16 However, FE said that the mode and 
behavior of this particular failure event were both 
first time occurrences and ones which, at the time, 
FE’s IT personnel neither recognized nor knew 
how to correct. FE staff told investigators that it 
was only during a post-outage support call with 
GE late on 14 August that FE and GE determined 
that the only available course of action to correct 
the alarm problem was a “cold reboot” 17 of FE’s 
overall XA21 system. In interviews immediately 
after the blackout, FE IT personnel indicated that 
they discussed a cold reboot of the XA21 system 
with control room operators after they were told of 
the alarm problem at 15:42 EDT, but decided not 
to take such action because operators considered 
power system conditions precarious, were con¬ 
cerned about the length of time that the reboot 
might take to complete, and understood that a cold 
reboot would leave them with even less EMS func¬ 
tionality until it was completed. 18 


Clues to the EMS Problems. There is an entry in 
FE’s western desk operator’s log at 14:14 EDT 
referring to the loss of alarms, but it is not clear 
whether that entry was made at that time or subse¬ 
quently, referring back to the last known alarm. 
There is no indication that the operator mentioned 
the problem to other control 
room staff and supervisors 
or to FE’s IT staff. 


Recommendation 


26, page 161 


The first clear hint to FE control room staff of any 
computer problems occurred at 14:19 EDT when a 
caller and an FE control room operator discussed 
the fact that three sub-transmission center 
dial-ups had failed. 19 At 14:25 EDT, a control 
room operator talked with a caller about the fail¬ 
ure of these three remote EMS consoles. 20 The 
next hint came at 14:32 EDT, when FE scheduling 
staff spoke about having made schedule changes 
to update the EMS pages, but that the totals did 
not update. 21 


Although FE’s IT staff would have been aware that
concurrent loss of its servers would mean the loss
of alarm processing on the EMS, the investigation
team has found no indication that the IT staff
informed the control room staff either when they
began work on the servers at 14:54 EDT, or when
they completed the primary server restart at 15:08
EDT. At 15:42 EDT, the IT staff were first told of
the alarm problem by a control room operator; FE
has stated to investigators that their IT staff had
been unaware before then that the alarm
processing sub-system of the EMS was not working.

Without the EMS systems, the only remaining
ways to monitor system conditions would have
been through telephone calls and direct analog
telemetry. FE control room personnel did not
realize that alarm processing on their EMS was not
working and, subsequently, did not monitor other
available telemetry.

Cause 2: Inadequate Situational Awareness

During the afternoon of August 14, FE operators
talked to their field personnel, MISO, PJM
(concerning an adjoining system in PJM’s
reliability coordination region), adjoining systems
(such as AEP), and customers. The FE operators
received pertinent information from all these
sources, but did not recognize the emerging
problems from the clues offered. This pertinent
information included calls such as that from FE’s
eastern control center asking about possible line
trips, FE Perry nuclear plant calls regarding what
looked like nearby line trips, AEP calling about
their end of the Star-South Canton line tripping,
and MISO and PJM calling about possible line
overloads.

Recommendations 19, page 156; 26, page 161

Without a functioning alarm system, the FE
control area operators failed to detect the tripping
of electrical facilities essential to maintain the
security of their control area. Unaware of the loss
of alarms and a limited EMS, they made no
alternate arrangements to monitor the system.
When AEP identified the 14:27 EDT circuit trip and
reclosure of the Star 345 kV line circuit breakers at
AEP’s South Canton substation, the FE operator
dismissed the information as either not accurate or
not relevant to his system, without following up
on the discrepancy between the AEP event and the
information from his own tools. There was no
subsequent verification of conditions with the
MISO reliability coordinator.

Cause 2: Inadequate Situational Awareness

Only after AEP notified FE that a 345-kV circuit
had tripped and locked out did the FE control
area operator compare this information to
actual breaker conditions. FE failed to inform its
reliability coordinator and adjacent control areas
when they became aware that system conditions
had changed due to unscheduled equipment
outages that might affect other control areas.



Phase 3: 

Three FE 345-kV 
Transmission Line Failures 
and Many Phone Calls: 
15:05 EDT to 15:57 EDT 


Overview of This Phase

From 15:05:41 EDT to 15:41:35 EDT, three 345-kV
lines failed with power flows at or below each
transmission line’s emergency rating. These line
trips were not random. Rather, each was the result
of a contact between a line and a tree that
had grown so tall that, over a period of years, it
encroached into the required clearance height for
the line. As each line failed, its outage increased
the loading on the remaining lines (Figure 5.5). As
each of the transmission lines failed, and power
flows shifted to other transmission paths, voltages
on the rest of FE’s system degraded further (Figure
5.6).

Key Phase 3 Events

3A) 15:05:41 EDT: Harding-Chamberlin 345-kV
line tripped.

3B) 15:31-33 EDT: MISO called PJM to determine
if PJM had seen the Stuart-Atlanta 345-kV
line outage. PJM confirmed Stuart-Atlanta
was out.

3C) 15:32:03 EDT: Hanna-Juniper 345-kV line
tripped.

3D) 15:35 EDT: AEP asked PJM to begin work on a
350-MW TLR to relieve overloading on the
Star-South Canton line, not knowing the
Hanna-Juniper 345-kV line had already
tripped at 15:32 EDT.

3E) 15:36 EDT: MISO called FE regarding
post-contingency overload on Star-Juniper
345-kV line for the contingency loss of the
Hanna-Juniper 345-kV line, unaware at the
start of the call that Hanna-Juniper had
already tripped.

3F) 15:41:33-41 EDT: Star-South Canton 345-kV
tripped, reclosed, tripped again at 15:41:35
EDT and remained out of service, all while
AEP and PJM were discussing TLR relief
options (event 3D).

Transmission lines are designed with the expecta¬ 
tion that they will sag lower when they become 
hotter. The transmission line gets hotter with 
heavier line loading and under higher ambient 
temperatures, so towers and conductors are 
designed to be tall enough and conductors pulled 
tightly enough to accommodate expected sagging 
and still meet safety requirements. On a summer 
day, conductor temperatures can rise from 60°C 
on mornings with average wind to 100°C with hot 
air temperatures and low wind conditions. 

A short-circuit occurred on the Harding- 
Chamberlin 345-kV line due to a contact between 
the line conductor and a tree. This line failed with 
power flow at only 44% of its normal and emer¬ 
gency line rating. Incremental line current and 
temperature increases, escalated by the loss of 
Harding-Chamberlin, caused more sag on the 
Hanna-Juniper line, which contacted a tree and 
failed with power flow at 88% of its normal 
and emergency line rating. Star-South Canton 


Figure 5.5. FirstEnergy 345-kV Line Flows

Figure 5.6. Voltages on FirstEnergy’s 345-kV Lines:
Impacts of Line Trips

Figure 5.7. Timeline Phase 3

contacted a tree three times between 14:27:15 EDT
and 15:41:33 EDT, opening and reclosing each 
time before finally locking out while loaded at 
93% of its emergency rating at 15:41:35 EDT. Each 
of these three lines tripped not because of exces¬ 
sive sag due to overloading or high conductor tem¬ 
perature, but because it hit an overgrown, 
untrimmed tree. 22 

Overgrown trees, as opposed to 
excessive conductor sag, caused 
each of these faults. While sag 
may have contributed to these 
events, these incidents occurred 
because the trees grew too tall and encroached 
into the space below the line which is intended 
to be clear of any objects, not because the lines 
sagged into short trees. Because the trees were so 
tall (as discussed below), each of these lines 
faulted under system conditions well within spec¬ 
ified operating parameters. The investigation team 
found field evidence of tree contact at all three 
locations, including human observation of the 
Hanna-Juniper contact. Evidence outlined below 
confirms that contact with trees caused the short 
circuits to ground that caused each line to trip out 
on August 14. 

To be sure that the evidence of tree/line contacts 
and tree remains found at each site was linked to 
the events of August 14, the team looked at 
whether these lines had any prior history of out¬ 
ages in preceding months or years that might have 
resulted in the burn marks, debarking, and other 
vegetative evidence of line contacts. The record 
establishes that there were no prior sustained out¬ 
ages known to be caused by trees for these lines in 
2001, 2002, and 2003. 23 


Like most transmission owners, FE patrols its lines 
regularly, flying over each transmission line twice 
a year to check on the condition of the 
rights-of-way. Notes from fly-overs in 2001 and 
2002 indicate that the examiners saw a significant 
number of trees and brush that needed clearing or 


Line Ratings 

A conductor’s normal rating reflects how 
heavily the line can be loaded under routine 
operation and keep its internal temperature 
below a certain temperature (such as 90°C). A 
conductor’s emergency rating is often set to 
allow higher-than-normal power flows, but to 
limit its internal temperature to a maximum 
temperature (such as 100°C) for no longer than a 
specified period, so that it does not sag too low 
or cause excessive damage to the conductor. 

For three of the four 345-kV lines that failed, 
FE set the normal and emergency ratings at the 
same level. Many of FE’s lines are limited by the 
maximum temperature capability of its termi¬ 
nal equipment, rather than by the maximum 
safe temperature for its conductors. In calculat¬ 
ing summer emergency ampacity ratings for 
many of its lines, FE assumed 90°F (32°C) ambi¬ 
ent air temperatures and 6.3 ft/sec (1.9 m/sec) 
wind speed, 3 which is a relatively high wind 
speed assumption for favorable wind cooling. 
Actual temperature on August 14 was 87°F 
(31°C) but wind speed at certain locations in the 
Akron area was somewhere between 0 and 2 
ft/sec (0.6 m/sec) after 15:00 EDT that afternoon. 
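
The metric equivalents quoted above can be checked with standard unit conversions. The helper names below are assumptions made for this sketch; the conversion factors themselves are standard.

```python
FT_TO_M = 0.3048  # exact, by definition of the international foot


def ft_per_s_to_m_per_s(speed_ft_s):
    """Convert a speed from feet per second to metres per second."""
    return speed_ft_s * FT_TO_M


def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


assumed_wind = round(ft_per_s_to_m_per_s(6.3), 1)   # rating assumption
actual_wind_max = round(ft_per_s_to_m_per_s(2.0), 1)  # upper end on August 14
assumed_temp = round(fahrenheit_to_celsius(90.0))
actual_temp = round(fahrenheit_to_celsius(87.0))
```

The comparison shows that the ambient temperature was close to the rating assumption, while the wind speed was at most about a third of the assumed value, so the conductors had much less wind cooling than the ratings assumed.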

FirstEnergy Transmission Planning Criteria (Revision 8), 
page 3.

trimming along many FE transmission lines. Notes
from fly-overs in the spring of 2003 found fewer 
problems, suggesting that fly-overs do not allow 
effective identification of the distance between a
tree and the line above it, and need to be
supplemented with ground patrols.

Recommendations 16, page 154; 27, page 162


3A) FE’s Harding-Chamberlin 345-kV Line 
Tripped: 15:05 EDT 

At 15:05:41 EDT, FE’s Harding-Chamberlin line 
(Figure 5.8) tripped and locked out while loaded at 
44% of its normal and emergency rating. At this 
low loading, the line temperature would not 
exceed safe levels—even if still air meant there 


Utility Vegetation Management: When Trees and Lines Contact 


Vegetation management is critical to any utility 
company that maintains overhead energized 
lines. It is important and relevant to the August 
14 events because electric power outages occur 
when trees, or portions of trees, grow up or fall 
into overhead electric power lines. While not all 
outages can be prevented (due to storms, heavy 
winds, etc.), some outages can be mitigated or 
prevented by managing the vegetation before it 
becomes a problem. When a tree contacts a 
power line it causes a short circuit, which is read 
by the line’s relays as a ground fault. Direct phys¬ 
ical contact is not necessary for a short circuit to 
occur. An electric arc can occur between a part of 
a tree and a nearby high-voltage conductor if a 
sufficient distance separating them is not
maintained. Arcing distances vary based on
factors such as voltage and ambient wind and
temperature conditions. Arcs can cause fires as 
well as short circuits and line outages. 

Most utilities have right-of-way and easement 
agreements allowing them to clear and maintain 
vegetation as needed along their lines to provide 
safe and reliable electric power. Transmission 
easements generally give the utility a great deal 
of control over the landscape, with extensive 
rights to do whatever work is required to main¬ 
tain the lines with adequate clearance through 
the control of vegetation. The three principal 
means of managing vegetation along a transmis¬ 
sion right-of-way are pruning the limbs adjacent 
to the line clearance zone, removing vegetation 
completely by mowing or cutting, and using her¬ 
bicides to retard or kill further growth. It is com¬ 
mon to see more tree and brush removal using 
mechanical and chemical tools and relatively 
less pruning along transmission rights-of-way. 

FE’s easement agreements establish extensive 
rights regarding what can be pruned or removed 


in these transmission rights-of-way, including: 
“the right to erect, inspect, operate, replace, relo¬ 
cate, repair, patrol and permanently maintain 
upon, over, under and along the above described 
right of way across said premises all necessary 
structures, wires, cables and other usual fixtures 
and appurtenances used for or in connection 
with the transmission and distribution of electric 
current, including telephone and telegraph, and 
the right to trim, cut, remove or control by any 
other means at any and all times such trees, limbs 
and underbrush within or adjacent to said right 
of way as may interfere with or endanger said 
structures, wires or appurtenances, or their
operations.” a

FE uses a 5-year cycle for transmission line vege¬ 
tation maintenance (i.e., it completes all required 
vegetation work within a 5-year period for all cir¬ 
cuits). A 5-year cycle is consistent with industry 
practices, and it is common for transmission pro¬ 
viders not to fully exercise their easement rights 
on transmission rights-of-way due to landowner 
or land manager opposition. 

A detailed study prepared for this investigation, 
“Utility Vegetation Management Final Report,” 
concludes that although FirstEnergy’s vegetation 
management practices are within common or 
average industry practices, those common indus¬ 
try practices need significant improvement to 
assure greater transmission reliability. b The
report further recommends that strict regulatory 
oversight and support will be required for utili¬ 
ties to improve and sustain needed improve¬ 
ments in their vegetation management programs. 

NERC has no standards or requirements for vege¬ 
tation management or transmission right-of-way 
clearances, nor for the determination of line 
ratings. 


a Standard language in FE’s right-of-way easement agreement. 

b “Utility Vegetation Management Final Report,” CN Utility Consulting, March 2004. 
Cause 3


Inadequate 

Tree 

Trimming 


was no wind cooling of the con¬ 
ductor—and the line would not 
sag excessively. The investigation 
team examined the relay data for 
this trip, identified the geo¬ 
graphic location of the fault, and determined that 
the relay data match the classic “signature” pat¬ 
tern for a tree/line short circuit to ground fault. 
The field team found the remains of trees and 
brush at the fault location determined from the 
relay data. At this location, conductor height mea¬ 
sured 46 feet 7 inches (14.20 meters), while the 
height of the felled tree measured 42 feet (12.80 
meters); however, portions of the tree had been 
removed from the site. This means that while it is 
difficult to determine the exact height of the line 
contact, the measured height is a minimum and 
the actual contact was likely 3 to 4 feet (0.9 to 1.2 
meters) higher than estimated here. Burn marks 
were observed 35 feet 8 inches (10.87 meters) up 
the tree, and the crown of this tree was at least 6 
feet (1.83 meters) taller than the observed burn 
marks. The tree showed evidence of fault current
damage. 24
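The height comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the investigation's analysis: it converts the reported feet-and-inches measurements to meters and computes the gap between the measured conductor height and the tree, assuming (as the text suggests) that the standing tree was 3 to 4 feet taller than the felled remains. The function and variable names are illustrative only.

```python
INCHES_PER_FOOT = 12
METERS_PER_INCH = 0.0254  # exact by definition


def to_inches(feet, inches=0):
    """Convert a feet-and-inches measurement to total inches."""
    return feet * INCHES_PER_FOOT + inches


def to_meters(total_inches):
    """Convert inches to meters."""
    return total_inches * METERS_PER_INCH


# Measurements reported by the field team for Harding-Chamberlin.
conductor_in = to_inches(46, 7)  # conductor height at the fault location
tree_in = to_inches(42)          # measured height of the felled tree
burn_in = to_inches(35, 8)       # height of the observed burn marks

# Gap between the conductor and the tree as measured (a minimum tree height).
measured_gap_in = conductor_in - tree_in

# Assumption from the text: the tree was likely 3 to 4 feet taller.
gap_if_4ft_taller_in = measured_gap_in - to_inches(4)
gap_if_3ft_taller_in = measured_gap_in - to_inches(3)

print(f"conductor: {to_meters(conductor_in):.2f} m")
print(f"tree:      {to_meters(tree_in):.2f} m")
print(f"measured gap: {measured_gap_in} in")
print(f"estimated gap: {gap_if_4ft_taller_in} to {gap_if_3ft_taller_in} in")
```

Under these assumptions the estimated gap shrinks from 4 feet 7 inches to somewhere between 7 and 19 inches. The comparison uses the conductor height measured during the field visit, not the height of the energized line on August 14, and arcing does not require direct contact.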


Recommendations 


16, page 154; 27, page 162


When the Harding-Chamberlin line locked out, 
the loss of this 345-kV path caused the remaining 
three southern 345-kV lines into Cleveland to pick 
up more load, with Hanna-Juniper picking up 
the most. The Harding-Chamberlin outage also 
caused more power to flow through the underly¬ 
ing 138-kV system. 


MISO did not discover that Har¬ 
ding-Chamberlin had tripped 
until after the blackout, when 
MISO reviewed the breaker oper¬ 
ation log that evening. FE indi¬ 
cates that it discovered the line was out while 
investigating system conditions in response to 
MISO’s call at 15:36 EDT, when MISO told FE 


Cause 2 


Inadequate 

Situational 

Awareness 


that MISO’s flowgate monitoring tool showed 
a Star-Juniper line overload following a contin¬ 
gency loss of Hanna-Juniper; 25 however, the 
investigation team has found no evidence within 
the control room logs or transcripts to show that 
FE knew of the Harding- 
Chamberlin line failure 
until after the blackout. 


Recommendation 


22, page 159 


Harding-Chamberlin was not one 
of the flowgates that MISO moni¬ 
tored as a key transmission loca¬ 
tion, so the reliability coordinator 
was unaware when FE’s first 
345-kV line failed. Although MISO received 


Cause 4 


Inadequate 
RC Diagnostic 
Support 


Figure 5.8. Harding-Chamberlin 345-kV Line 



SCADA input of the line’s status change, this was 
presented to MISO operators as breaker status 
changes rather than a line failure. Because their 
EMS system topology processor had not yet been 
linked to recognize line failures, it did not connect 
the breaker information to the loss of a transmis¬ 
sion line. Thus, MISO’s operators did not recog¬ 
nize the Harding-Chamberlin trip as a significant 
contingency event and could not advise FE regard¬ 
ing the event or its consequences. Further, with¬ 
out its state estimator and associated contingency 
analyses, MISO was unable to identify potential 
overloads that would occur due to various line or 
equipment outages. Accordingly, when the Har¬ 
ding-Chamberlin 345-kV line tripped at 15:05 
EDT, the state estimator did not produce results 

and could not predict an
overload if the Hanna-


Recommendation 


Juniper 345-kV line were to 
fail. 
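The missing link between breaker status and line status can be illustrated with a small sketch. This is not MISO's topology processor; it only shows the kind of inference that was absent, under the simplifying assumption that a line is out of service once every breaker at either of its terminals is open. The breaker identifiers and the mapping table are hypothetical.

```python
# Hypothetical breaker-to-line mapping. The breaker IDs are invented for
# illustration and do not come from FE's or MISO's models.
LINE_TERMINALS = {
    "Harding-Chamberlin 345-kV": [["HARD_CB1", "HARD_CB2"],
                                  ["CHAM_CB1", "CHAM_CB2"]],
    "Hanna-Juniper 345-kV": [["HANN_CB1", "HANN_CB2"],
                             ["JUNI_CB1", "JUNI_CB2"]],
}


def line_in_service(terminals, breaker_closed):
    """A line carries power only if each end has at least one closed breaker."""
    return all(any(breaker_closed[b] for b in end) for end in terminals)


def lines_out_of_service(line_terminals, breaker_closed):
    """Translate raw breaker status into a list of lines that are out."""
    return [name for name, terminals in line_terminals.items()
            if not line_in_service(terminals, breaker_closed)]


# Start with every breaker closed, then open both breakers at one terminal.
status = {b: True
          for ends in LINE_TERMINALS.values() for end in ends for b in end}
status["HARD_CB1"] = False
status["HARD_CB2"] = False

outages = lines_out_of_service(LINE_TERMINALS, status)
print(outages)
```

With this kind of mapping in place, two breaker status changes are reported to the operator as one line outage, which is the event a contingency analysis needs as input.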


3C) FE’s Hanna-Juniper 345-kV Line Tripped: 
15:32 EDT 


At 15:32:03 EDT the Hanna- 
Juniper line (Figure 5.9) tripped 
and locked out. A tree-trimming 
crew was working nearby and 
observed the tree/line contact. 
The tree contact occurred on the south phase, 
which is lower than the center phase due to 
construction design. Although little evidence 
remained of the tree during the field team’s visit in 
October, the team observed a tree stump 14 inches 
(35.5 cm) in diameter at its ground line and talked 
to an individual who witnessed the contact on 
August 14. 26 Photographs clearly indicate that the 
tree was of excessive height (Figure 5.10). Sur¬ 
rounding trees were 18 inches (45.7 cm) in diame¬ 
ter at ground line and 60 feet (18.3 meters) in 
height (not near lines). Other sites at this location
had numerous (at least 20) trees in this right- 
of-way. 

Hanna-Juniper was loaded at 88% of its normal 
and emergency rating when it tripped. With this 
line open, over 1,200 MVA of power flow had to 
find a new path to reach its load in Cleveland. 
Loading on the remaining two 345-kV lines 
increased, with Star-Juniper taking the bulk of the 
power. This caused Star-South Canton’s loading 
to rise above its normal but within its emergency 
rating and pushed more power onto the 138-kV 
system. Flows west into Michigan decreased 
slightly and voltages declined somewhat in the 
Cleveland area. 


Figure 5.9. Hanna-Juniper 345-kV Line



Why Did So Many Tree-to-Line Contacts Happen on August 14? 


Tree-to-line contacts and resulting transmission 
outages are not unusual in the summer across 
much of North America. The phenomenon 
occurs because of a combination of events occur¬ 
ring particularly in late summer: 

♦ Most tree growth occurs during the spring and 
summer months, so the later in the summer 
the taller the tree and the greater its potential 
to contact a nearby transmission line. 

♦ As temperatures increase, customers use more 
air conditioning and load levels increase. 
Higher load levels increase flows on the trans¬ 
mission system, causing greater demands for 
both active power (MW) and reactive power 
(MVAr). Higher flow on a transmission line 
causes the line to heat up, and the hot line sags 
lower because the hot conductor metal 
expands. Most emergency line ratings are set 
to limit conductors’ internal temperatures to 
no more than 100°C (212°F). 


♦ As temperatures increase, ambient air temper¬ 
atures provide less cooling for loaded trans¬ 
mission lines. 

♦ Wind cools transmission lines by increasing
the flow of air across the line.
On August 14 wind speeds at the Ohio 
Akron-Fulton airport averaged 5 knots (1.5 
m/sec) at around 14:00 EDT, but by 15:00 EDT 
wind speeds had fallen to 2 knots (0.6 m/sec)— 
the wind speed commonly assumed in con¬ 
ductor design—or lower. With lower winds, 
the lines sagged further and closer to any tree 
limbs near the lines. 

This combination of events on August 14 across 
much of Ohio and Indiana caused transmission 
lines to heat and sag. If a tree had grown into a 
power line’s designed clearance area, then a 
tree/line contact was more likely, though not 
inevitable. An outage on one line would increase 
power flows on related lines, causing them to be 
loaded higher, heat further, and sag lower. 
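The link between conductor heating and sag can be sketched with the standard parabolic approximation for a level span, in which conductor length L, span S and sag D are related by L ≈ S + 8D²/(3S). The example below is a rough illustration only: it applies free thermal expansion and ignores the change in tension and elastic stretch, so it overstates the sag increase. The span length, initial sag, temperature rise and expansion coefficient are all assumed values, not data from the investigation.

```python
import math


def conductor_length(span_m, sag_m):
    """Parabolic approximation of conductor length over a level span."""
    return span_m + 8 * sag_m ** 2 / (3 * span_m)


def sag_from_length(span_m, length_m):
    """Invert the parabolic approximation to recover sag from length."""
    return math.sqrt(3 * span_m * (length_m - span_m) / 8)


def sag_after_heating(span_m, sag_m, delta_t_c, alpha_per_c=2.0e-5):
    """Sag after a temperature rise, assuming free thermal expansion only.

    alpha_per_c is an assumed linear expansion coefficient; real conductors
    differ by construction, and tension effects are ignored here.
    """
    heated = conductor_length(span_m, sag_m) * (1 + alpha_per_c * delta_t_c)
    return sag_from_length(span_m, heated)


# Hypothetical span: 300 m long with 8 m of sag, heated by 40 degrees C.
cool_sag = 8.0
hot_sag = sag_after_heating(300.0, cool_sag, 40.0)
print(f"sag grows from {cool_sag:.2f} m to {hot_sag:.2f} m")
```

Even in this simplified form, a small fractional increase in conductor length produces a much larger fractional increase in sag, which is why heavily loaded lines in still air close the gap to vegetation so quickly.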



3D) AEP and PJM Begin Arranging a TLR for
Star-South Canton: 15:35 EDT 

Because its alarm system was not 
working, FE was not aware of the 
Harding-Chamberlin or Hanna- 
Juniper line trips. However, once 
MISO manually updated the state 
estimator model for the Stuart-Atlanta 345-kV line 
outage, the software successfully completed a 
state estimation and contingency analysis at 15:41 
EDT. But this left a 36 minute period, from 15:05 
EDT to 15:41 EDT, during which MISO did not 
recognize the consequences of the Hanna-Juniper 
loss, and FE operators knew neither of the line’s 
loss nor its consequences. PJM and AEP recog¬ 
nized the overload on Star-South Canton, but had 
not expected it because their earlier contingency 
analysis did not examine enough lines within the 
FE system to foresee this result of the Hanna- 
Juniper contingency on top of the Harding- 
Chamberlin outage. 

After AEP recognized the Star-South Canton over¬ 
load, at 15:35 EDT AEP asked PJM to begin 


developing a 350 MW TLR to mitigate it. The TLR 
was to relieve the actual overload above normal 
rating then occurring on Star-South Canton, and 
prevent an overload above emergency rating on 


Figure 5.10. Cause of the Hanna-Juniper Line Loss 



This August 14 photo shows the tree that caused the loss of 
the Hanna-Juniper line (tallest tree in photo). Other 345-kV 
conductors and shield wires can be seen in the background. 
Photo by Nelson Tree. 


Cause 4 


Inadequate 
RC Diagnostic 
Support 


Handling Emergencies by Shedding Load and Arranging TLRs 


Transmission loading problems. Problems such 
as contingent overloads of normal ratings are 
typically handled by arranging Transmission 
Loading Relief (TLR) measures, which in most 
cases take effect as a schedule change 30 to 60 
minutes after they are issued. Apart from a TLR 
level 6, TLRs are intended as a tool to prevent the 
system from being operated in an unreliable 
state, a and are not applicable in real-time emergency
situations because it takes too long to
implement reductions. Actual overloads and vio¬ 
lations of stability limits need to be handled 
immediately under TLR level 4 or 6 by redis¬ 
patching generation, system reconfiguration or 
tripping load. The dispatchers at FE, MISO and 
other control areas or reliability coordinators 
have authority—and under NERC operating poli¬ 
cies, responsibility—to take such action, but the 
occasion to do so is relatively rare. 

Lesser TLRs reduce scheduled transactions— 
non-firm first, then pro-rata between firm trans¬ 
actions, including flows that serve native load. 
When pre-contingent conditions are not solved 
with TLR levels 3 and 5, or conditions reach 
actual overloading or surpass stability limits, 
operators must use emergency generation 


redispatch and/or load-shedding under TLR level 
6 to return to a secure state. After a secure state is 
reached, TLR level 3 and/or 5 can be initiated to 
relieve the emergency generation redispatch or 
load-shedding activation. 
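The curtailment order described above (non-firm transactions first, then firm transactions pro rata) can be sketched as follows. This is a simplified illustration, not the NERC TLR procedure: it assumes each MW curtailed yields 1 MW of relief, whereas actual relief depends on each transaction's impact on the constrained facility, and it also assumes pro-rata sharing within the non-firm group, which the text does not specify. The transaction data are hypothetical.

```python
def curtail(transactions, relief_mw):
    """Allocate curtailments: non-firm first, then firm, pro rata in each group.

    transactions: list of dicts with keys "id", "mw" and "firm" (bool).
    Returns a dict mapping transaction id to the MW curtailed.
    """
    cuts = {t["id"]: 0.0 for t in transactions}
    remaining = relief_mw
    for firm in (False, True):  # non-firm group is exhausted before firm
        group = [t for t in transactions if t["firm"] == firm]
        total = sum(t["mw"] for t in group)
        if remaining <= 0 or total == 0:
            continue
        share = min(1.0, remaining / total)  # fraction of each schedule cut
        for t in group:
            cuts[t["id"]] = t["mw"] * share
        remaining -= total * share
    return cuts


# Hypothetical schedules, in MW.
schedules = [
    {"id": "NF-1", "mw": 100.0, "firm": False},
    {"id": "NF-2", "mw": 50.0, "firm": False},
    {"id": "F-1", "mw": 200.0, "firm": True},
    {"id": "F-2", "mw": 200.0, "firm": True},
]

small_request = curtail(schedules, 75.0)
large_request = curtail(schedules, 250.0)
print(small_request)
print(large_request)
```

A request that fits within the non-firm total leaves firm schedules untouched; a larger request removes all non-firm schedules and spreads the balance evenly, in proportion, across firm ones.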

System operators and reliability coordinators, by 
NERC policy, have the responsibility and the 
authority to take actions up to and including 
emergency generation redispatch and shedding 
firm load to preserve system security. On August 
14, because they either did not know or under¬ 
stand enough about system conditions at the 
time, system operators at FE, MISO, PJM, or AEP 
did not call for emergency actions. 

Use of automatic procedures in voltage-related 
emergencies. There are few automatic safety nets 
in place in northern Ohio except for under¬ 
frequency load-shedding in some locations. In 
some utility systems in the U.S. Northeast, 
Ontario, and parts of the Western Interconnection,
special protection systems or remedial
action schemes, such as under-voltage load-shedding,
are used to shed load under defined
severe contingency conditions similar to those 
that occurred in northern Ohio on August 14. 


a “Northern MAPP/Northwestern Ontario Disturbance-June 25, 1998,” NERC 1998 Disturbance Report, page 17. 
that line if the Sammis-Star line were to fail. But
when they began working on the TLR, neither AEP 
nor PJM realized that the Hanna-Juniper 345-kV 
line had already tripped at 15:32 EDT, further 
degrading system conditions. Since the great 
majority of TLRs are for cuts of 25 to 50 MW, a 350 
MW TLR request was highly unusual and opera¬ 
tors were attempting to confirm why so much 
relief was suddenly required before implementing 
the requested TLR. Less than ten minutes elapsed 
between the loss of Hanna-Juniper, the overload 
above the normal limits of 
Star-South Canton, and the 
Star-South Canton trip and 
lock-out. 
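The intervals between the three 345-kV lock-outs can be computed directly from the timestamps given in this chapter. The sketch below is only an arithmetic check of the "less than ten minutes" statement; the event labels and helper name are illustrative.

```python
from datetime import datetime

TIME_FORMAT = "%H:%M:%S"

# Lock-out times (EDT, August 14) as reported in this chapter.
events = [
    ("Harding-Chamberlin lock-out", "15:05:41"),
    ("Hanna-Juniper lock-out", "15:32:03"),
    ("Star-South Canton lock-out", "15:41:35"),
]


def intervals(event_list):
    """Return (earlier event, later event, elapsed seconds) for each pair."""
    times = [datetime.strptime(stamp, TIME_FORMAT) for _, stamp in event_list]
    return [
        (event_list[i][0], event_list[i + 1][0],
         int((times[i + 1] - times[i]).total_seconds()))
        for i in range(len(event_list) - 1)
    ]


gaps = intervals(events)
for earlier, later, seconds in gaps:
    minutes, secs = divmod(seconds, 60)
    print(f"{earlier} -> {later}: {minutes} min {secs} s")
```

The second interval, 9 minutes 32 seconds, is the window in which operators would have had to recognize the Hanna-Juniper loss and act before Star-South Canton locked out.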


Recommendations 


6, page 147; 22, page 159;
30, page 163; 31, page 163


Unfortunately, neither AEP nor 
PJM recognized that even a 350 
MW TLR on the Star-South Can¬ 
ton line would have had little 
impact on the overload. Investi¬ 
gation team analysis using the Interchange Distri¬ 
bution Calculator (which was fully available on 
the afternoon of August 14) indicates that tagged 
transactions for the 15:00 EDT hour across Ohio 
had minimal impact on the overloaded lines. As 
discussed in Chapter 4, this analysis showed that 
after the loss of the Hanna-Juniper 345 kV line, 
Star-South Canton was loaded primarily with 
flows to serve native and network loads, deliver¬ 
ing makeup energy for the loss of Eastlake 5, pur¬ 
chased from PJM (342 MW) and Ameren (126 
MW). These high loadings could not have been
relieved by the redispatch that AEP requested, but
only by significant load-shedding by FE in the
Cleveland area.


Cause 2 


Inadequate 

Situational 

Awareness 


The primary tool MISO uses for 
assessing reliability on key 
flowgates (specified groupings of 
transmission lines or equipment 
that sometimes have less transfer 
capability than desired) is the flowgate monitoring 
tool. After the Harding-Chamberlin 345-kV line 
outage at 15:05 EDT, the flowgate monitoring tool 
produced incorrect (obsolete) results, because the 
outage was not reflected in the model. As a result, 
the tool assumed that Harding-Chamberlin was 
still available and did not predict an overload for 
loss of the Hanna-Juniper 345-kV line. When 
Hanna-Juniper tripped at 15:32 EDT, the resulting 
overload was detected by MISO’s SCADA and set 
off alarms to MISO’s system operators, who then 
phoned FE about it. 27 Because both MISO’s 
state estimator and its flowgate monitoring tool 


Cause 4 


Inadequate 
RC Diagnostic 
Support 


were not working properly, 
MISO’s ability to recognize 
FE’s evolving contingency 
situation was impaired. 



3F) Loss of the Star-South Canton 345-kV Line: 
15:41 EDT 


The Star-South Canton line (Figure 5.11) crosses 
the boundary between FE and AEP—each com¬ 
pany owns the portion of the line and manages the 
right-of-way within its respective territory. The 
Star-South Canton line tripped and reclosed three 
times on the afternoon of August 14, first at 
14:27:15 EDT while carrying less than 55% of its 
emergency rating (reclosing at both ends), then at 
15:38:48 and again at 15:41:33 EDT. These multi¬ 
ple contacts had the effect of “electric 
tree-trimming,” burning back the contacting limbs 
temporarily and allowing the line to carry more 
current until further sag in the still air caused the 
final contact and lock-out. At 15:41:35 EDT the 
line tripped and locked out at the Star substation, 
with power flow at 93% of its emergency rating. A 
short-circuit to ground occurred in each case. 

The investigation’s field team 
inspected the right of way in the 
location indicated by the relay 
digital fault recorders, in the FE 
portion of the line. They found 
debris from trees and vegetation that had been 
felled. At this location the conductor height 
was 44 feet 9 inches (13.6 meters). The identifiable 
tree remains measured 30 feet (9.1 meters) in 
height, although the team could not verify the 
location of the stump, nor find all sections of the 
tree. A nearby cluster of trees showed significant 
fault damage, including charred limbs and 
de-barking from fault current. Further, topsoil in 
the area of the tree trunk was disturbed, discolored
and broken up, a common indication of a higher
magnitude fault or multiple faults. Analysis of
another stump showed that a fourteen year-old
tree had recently been removed from the middle of
the right-of-way. 28

Figure 5.11. Star-South Canton 345-kV Line


Recommendations 


16, page 154; 27, page 162


After the Star-South Canton line was lost, flows 
increased greatly on the 138-kV system toward 
Cleveland and area voltage levels began to degrade 
on the 138-kV and 69-kV system. At the same 
time, power flows increased on the Sammis-Star 
345-kV line due to the 138-kV line trips—the only 
remaining paths into Cleveland from the south. 

FE’s operators were not aware that 
the system was operating outside 
first contingency limits after the 
Harding-Chamberlin trip (for the 
possible loss of Hanna-Juniper or 
the Perry unit), because they did not conduct 
a contingency analysis. 29 The investigation team 
has not determined whether the system status 
information used by FE’s 
state estimator and contin¬ 
gency analysis model was 
being accurately updated. 


Cause 2 


Inadequate 

Situational 

Awareness 


Recommendation 


22, page 159


Cause 1 


Inadequate 

System 

Understanding 


Load-Shed Analysis. The investi¬ 
gation team looked at whether it 
would have been possible to pre¬ 
vent the blackout by shedding 
load within the Cleveland-Akron 
area before the Star-South Canton 345 kV line trip¬ 
ped at 15:41 EDT. The team modeled the system 
assuming 500 MW of load shed within the Cleve¬ 
land-Akron area before 15:41 EDT and found that 
this would have improved voltage at the Star bus 
from 91.7% up to 95.6%, pulling the line loading 
from 91 to 87% of its emergency ampere rating; an 
additional 500 MW of load would have had to be 
dropped to improve Star voltage to 96.6% and the 
line loading to 81% of its emergency ampere rat¬ 
ing. But since the Star-South Canton line had 
already been compromised by the tree below it 
(which caused the first two trips and reclosures), 
and was about to trip from tree contact a third 
time, it is not clear that had such load shedding 
occurred, it would have prevented the ultimate 
trip and lock-out of the line. However, modeling 
indicates that this load shed 
would have prevented the 
subsequent tripping of the 
Sammis-Star line (see page 
70). 
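The modeled load-shed results quoted above can be laid out as a small table to make the diminishing voltage benefit explicit. The sketch below only restates the figures in the text and takes differences between successive cases; it does not reproduce the investigation team's power-flow model, and the variable names are illustrative.

```python
# (cumulative load shed in MW, Star bus voltage in %, Star-South Canton
#  loading in % of emergency ampere rating), as quoted in the text.
cases = [
    (0, 91.7, 91),
    (500, 95.6, 87),
    (1000, 96.6, 81),
]


def marginal_effects(rows):
    """Change in voltage and loading for each additional block of load shed."""
    effects = []
    for (shed0, volt0, load0), (shed1, volt1, load1) in zip(rows, rows[1:]):
        effects.append((shed1 - shed0, round(volt1 - volt0, 1), load1 - load0))
    return effects


steps = marginal_effects(cases)
for block_mw, d_volt, d_load in steps:
    print(f"+{block_mw} MW shed: voltage {d_volt:+.1f} points, "
          f"loading {d_load:+d} points")
```

On these figures the first 500 MW recovers most of the voltage (3.9 points, against 1.0 for the second 500 MW), while the second block gives the larger reduction in line loading.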


Recommendations 


8, page 147; 21, page 158


System impacts of the 345-kV 
failures. According to extensive 
investigation team modeling, 
there were no contingency limit 
violations as of 15:05 EDT before 
the loss of the Harding-Chamberlin 345-kV line. 
Figure 5.12 shows the line loadings estimated by 
investigation team modeling as the 345-kV lines in 
northeast Ohio began to trip. Showing line load¬ 
ings on the 345-kV lines as a percent of normal rat¬ 
ing, it tracks how the loading on each line 
increased as each subsequent 345-kV and 138-kV 
line tripped out of service between 15:05 EDT 
(Harding-Chamberlin, the first line above to 
stair-step down) and 16:06 EDT (Dale-West Can¬ 
ton). As the graph shows, none of the 345- or 
138-kV lines exceeded their normal ratings until 
after the combined trips of Harding-Chamberlin 
and Hanna-Juniper. But immediately after the sec¬ 
ond line was lost, Star-South Canton’s loading 
jumped from an estimated 82% of normal to 120% 
of normal (which was still below its emergency 
rating) and remained at the 120% level for 10 min¬ 
utes before tripping out. To the right, the graph 
shows the effects of the 138-kV line failures 
(discussed in the next phase) upon the 
two remaining 345-kV lines—i.e., Sammis-Star’s 
loading increased steadily above 100% with each 
succeeding 138-kV line lost. 

Following the loss of the Harding-Chamberlin 
345-kV line at 15:05 EDT, contingency limit viola¬ 
tions existed for: 

♦ The Star-Juniper 345-kV line, whose loadings 
would exceed emergency limits if the Hanna- 
Juniper 345-kV line were lost; and 

♦ The Hanna-Juniper and Harding-Juniper
345-kV lines, whose loadings would exceed
emergency limits if the Perry generation unit
(1,255 MW) were lost.

Figure 5.12. Cumulative Effects of Sequential
Outages on Remaining 345-kV Lines
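A first-contingency (N-1) screen of the kind that identifies such violations can be sketched with line outage distribution factors (LODFs), a common linear approximation: the flow on a monitored line after an outage is estimated as its pre-outage flow plus the LODF times the pre-outage flow on the lost line. The example below is illustrative only. The three lines, their flows, ratings and LODF values are hypothetical and do not represent FE's system or the investigation team's model, and generator outages such as the loss of Perry would need a separate treatment.

```python
def post_contingency_flow(pre_flow, lodf, monitored, outaged):
    """Estimated flow on `monitored` after losing `outaged` (linear model)."""
    return pre_flow[monitored] + lodf[(monitored, outaged)] * pre_flow[outaged]


def n1_violations(pre_flow, lodf, emergency_rating):
    """List (outaged line, overloaded line, estimated flow) for every
    single-line outage that pushes another line above its emergency rating."""
    violations = []
    for outaged in pre_flow:
        for monitored in pre_flow:
            if monitored == outaged:
                continue
            flow = post_contingency_flow(pre_flow, lodf, monitored, outaged)
            if abs(flow) > emergency_rating[monitored]:
                violations.append((outaged, monitored, flow))
    return violations


# Hypothetical three-line system; all values in MW.
pre_flow = {"A": 800.0, "B": 900.0, "C": 600.0}
emergency_rating = {"A": 1300.0, "B": 1400.0, "C": 1100.0}
lodf = {
    ("A", "B"): 0.6, ("C", "B"): 0.4,  # shares of B's flow picked up on loss of B
    ("B", "A"): 0.5, ("C", "A"): 0.5,
    ("A", "C"): 0.5, ("B", "C"): 0.5,
}

found = n1_violations(pre_flow, lodf, emergency_rating)
for outaged, overloaded, flow in found:
    print(f"loss of {outaged} overloads {overloaded}: {flow:.0f} MW "
          f"vs rating {emergency_rating[overloaded]:.0f} MW")
```

In this made-up case no line is overloaded before any outage, yet the system is already in violation because the loss of line B would push line A past its emergency rating. That is the sense in which a system can be outside first-contingency limits while every actual flow is still within ratings.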


Operationally, once FE’s system entered an N-1
contingency violation state, any facility loss
beyond that pushed them farther into violation
and into a more unreliable state. After loss of the
Harding-Chamberlin line, to avoid violating NERC
criteria, FE needed to reduce loading on these
three lines within 30 minutes such that no single
contingency would violate an emergency limit;
that is, to restore the system to a reliable operating
mode.


Phone Calls into the FE Control Room


Cause 2


Inadequate

Situational

Awareness


Beginning at 14:14 EDT when
their EMS alarms failed, and until
at least 15:42 EDT when they
began to recognize their situation,
FE operators did not understand
how much of their system was being lost, and did
not realize the degree to which their perception of
their system was in error versus true system conditions,
despite receiving clues via phone calls
from AEP, PJM and MISO, and customers. The FE
operators were not aware of line outages that
occurred after the trip of Eastlake 5 at 13:31 EDT
until approximately 15:45 EDT, although they
were beginning to get external input describing
aspects of the system’s weakening condition.
Since FE’s operators were not aware and did not
recognize events as they
were occurring, they took
no actions to return the system
to a reliable state.


Recommendations


19, page 156; 26, page 161


A brief description follows of some of the calls FE
operators received concerning system problems
and their failure to recognize that the problem was
on their system. For ease of presentation, this set
of calls extends past the time of the 345-kV line
trips into the time covered in the next phase, when
the 138-kV system collapsed.

Following the first trip of the Star-South Canton
345-kV line at 14:27 EDT, AEP called FE at 14:32
EDT to discuss the trip and reclose of the line. AEP
was aware of breaker operations at their end
(South Canton) and asked about operations at FE’s
Star end. FE indicated they had seen nothing at
their end of the line, but AEP reiterated that the
trip occurred at 14:27 EDT and that the South Canton
breakers had reclosed successfully. 30 There
was an internal FE conversation about the AEP
call at 14:51 EDT, expressing concern that they
had not seen any indication of an operation, but
lacking evidence within their control room, the FE
operators did not pursue the issue.


At 15:19 EDT, AEP called FE back to confirm that 
the Star-South Canton trip had occurred and that 
AEP had a confirmed relay operation from the site. 
FE’s operator restated that because they had 
received no trouble or alarms, they saw no prob¬ 
lem. An AEP technician at the South Canton sub¬ 
station verified the trip. At 15:20 EDT, AEP 
decided to treat the South Canton digital fault 
recorder and relay target information as a “fluke,” 
and checked the carrier relays to determine what 
the problem might be. 31 

At 15:35 EDT the FE control center received a call 
from the Mansfield 2 plant operator concerned 
about generator fault recorder triggers and excita¬ 
tion voltage spikes with an alarm for 
over-excitation, and a dispatcher called reporting 
a “bump” on their system. Soon after this call, FE’s 
Reading, Pennsylvania control center called 
reporting that fault recorders in the Erie west and 
south areas had activated, wondering if something 
had happened in the Ashtabula-Perry area. The 
Perry nuclear plant operator called to report a 
“spike” on the unit’s main transformer. When he 
went to look at the metering it was “still bouncing 
around pretty good. I’ve got it relay tripped up 
here ... so I know something ain’t right.” 32 


Beginning at this time, the FE operators began to 
think that something was wrong, but did not rec¬ 
ognize that it was on their system. “It’s got to be in 
distribution, or something like that, or somebody 
else’s problem . . . but I’m not showing any¬ 
thing.” 33 Unlike many other transmission grid 
control rooms, FE’s control center did not have a 
map board (which shows schematically all major 
lines and plants in the control area on the wall in 
front of the operators), which might have shown 
the location of significant 
line and facility outages 
within the control area. 


Recommendation 


22, page 159 


At 15:36 EDT, MISO contacted FE regarding the 
post-contingency overload on Star-Juniper for the 
loss of the Hanna-Juniper 345-kV line. 34 

At 15:42 EDT, FE’s western transmission operator 
informed FE’s IT staff that the EMS system func¬ 
tionality was compromised. “Nothing seems to be 
updating on the computers .... We’ve had people 
calling and reporting trips and nothing seems to be 
updating in the event summary ... I think we’ve 
got something seriously sick.” This is the first evidence
that a member of FE’s control room staff recognized
any aspect of their degraded EMS system.
There is no indication that he informed any of the 
other operators at this moment. However, FE’s IT 
staff discussed the subsequent EMS alarm correc¬ 
tive action with some control room staff shortly 
thereafter. 

Also at 15:42 EDT, the Perry plant operator called 
back with more evidence of problems. “I’m still 
getting a lot of voltage spikes and swings on the 
generator .... I don’t know how much longer 
we’re going to survive.” 35 

At 15:45 EDT, the tree trimming crew reported 
that they had witnessed a tree-caused fault on the 
Eastlake-Juniper 345-kV line; however, the actual 
fault was on the Hanna-Juniper 345-kV line in the 
same vicinity. This information added to the con¬ 
fusion in the FE control room, because the opera¬ 
tor had indication of flow on the Eastlake-Juniper 
line. 36 

After the Star-South Canton 345-kV line tripped a 
third time and locked out at 15:41:35 EDT, AEP 
called FE at 15:45 EDT to discuss and inform them 
that they had additional lines that showed over¬ 
load. FE recognized then that the Star breakers 
had tripped and remained open. 37 

At 15:46 EDT the Perry plant operator called the 
FE control room a third time to say that the unit 
was close to tripping off: “It’s not looking good.... 
We ain’t going to be here much longer and you’re 
going to have a bigger problem.” 38 

At 15:48 EDT, an FE transmission operator sent 
staff to man the Star substation, and then at 15:50 
EDT, requested staffing at the regions, beginning 
with Beaver, then East Springfield. 39 

At 15:48 EDT, PJM called MISO to report the 
Star-South Canton trip, but the two reliability 
coordinators’ measures of the resulting line flows 
on FE’s Sammis-Star 345-kV line did not match, 
causing them to wonder whether the Star-South 
Canton 345-kV line had returned to service. 40 

At 15:56 EDT, because PJM was still concerned 
about the impact of the Star-South Canton trip, 
PJM called FE to report that Star-South Canton 
had tripped and that PJM thought FE’s 
Sammis-Star line was in actual emergency limit 
overload. 41 FE could not confirm this overload. FE 
informed PJM that Hanna-Juniper was also out
of service. FE believed that the problems existed
beyond their system. “AEP must have lost some 
major stuff.” 42 


Emergency Action 

For FirstEnergy, as with many utilities, emergency 
awareness is often focused on energy shortages. 
Utilities have plans to reduce loads under these 
circumstances to increasingly greater degrees. 
Tools include calling for contracted customer load 
reductions, then public appeals, voltage reduc¬ 
tions, and finally shedding system load by cutting 
off interruptible and firm customers. FE has a plan 
for this that is updated yearly. While they can trip 
loads quickly where there is SCADA control of 
load breakers (although FE has few of these), from 
an energy point of view, the intent is to be able to 
regularly rotate what loads are not being served, 
which requires calling personnel out to switch the 
various groupings in and out. This event was not, 
however, a capacity or energy emergency or sys¬ 
tem instability, but an emergency due to transmis¬ 
sion line overloads. 

To handle an emergency effectively a dispatcher 
must first identify the emergency situation and 
then determine effective action. AEP identified 
potential contingency overloads at 15:36 EDT and 
called PJM even as Star-South Canton, one of the 
AEP/FE lines they were discussing, tripped and 
pushed FE’s Sammis-Star 345-kV line to its emer¬ 
gency rating. Since they had been focused on the 
impact of a Sammis-Star loss overloading Star- 
South Canton, they recognized that a serious prob¬ 
lem had arisen on the system for which they did 
not have a ready solution. Later, around 15:50 
EDT, their conversation reflected emergency con¬ 
ditions (138-kV lines were tripping and several 
other lines overloaded) but they still found no 
practical way to mitigate 
these overloads across util¬ 
ity and reliability coordina¬ 
tor boundaries. 



At the control area level, FE remained unaware of
the precarious condition its system was in, with
key lines out of service, degrading voltages, and
severe overloads on their remaining lines. Transcripts
show that FE operators were aware of falling
voltages and customer problems after loss of the
Hanna-Juniper 345-kV line (at 15:32 EDT). They
called out personnel to staff substations because
they did not think they could see them with their
data gathering tools. They were also talking to
customers. But there is no indication that FE’s
operators clearly identified their situation as a possible
emergency until around 15:45 EDT when the shift
supervisor informed his manager that it looked as
if they were losing the system; even then, although
FE had grasped that its system was in trouble, it
never officially declared that it was an emergency
condition and that emergency or extraordinary
action was needed.


FE’s internal control room procedures and protocols
did not prepare it adequately to identify and
react to the August 14 emergency. Throughout the
afternoon of August 14 there were many clues both
that FE had lost its critical monitoring alarm
functionality and that its transmission system’s
reliability was becoming progressively more
compromised. However, FE did not fully piece
these clues together until after it had already lost
critical elements of its transmission system and
only minutes before subsequent trips triggered the
cascade phase of the blackout. The clues to a
compromised EMS alarm system and transmission
system came into the FE control room from FE
customers, generators, AEP, MISO, and PJM. In
spite of these clues, because of a number of related
factors, FE failed to identify the emergency that
it faced.


Recommendations 20, page 158; 22, page 159; 26, page 161

Cause 2: Inadequate Situational Awareness

Recommendation 26, page 161

The most critical factor delaying the assessment
and synthesis of the clues was a lack of
information sharing between the FE system
operators. In interviews with
the FE operators and analysis of phone transcripts,
it is evident that rarely were any of the critical
clues shared with fellow operators. This lack of
information sharing can be attributed to:


1. Physical separation of operators (the reliability 
operator responsible for voltage schedules was 
across the hall from the transmission 
operators). 


2. The lack of a shared electronic log (visible to 
all), as compared to FE’s practice of separate 
hand-written logs. 43 


3. Lack of systematic procedures to brief incoming 
staff at shift change times. 

4. Infrequent training of operators in emergency
scenarios, identification and resolution of bad
data, and the importance of sharing key
information throughout the control room.


FE has specific written procedures and plans for
dealing with resource deficiencies, voltage
depressions, and overloads, and these include
instructions to adjust generators and trip firm
loads. After the loss of the Star-South Canton line,
voltages were below limits, and there were severe
line overloads. But FE did not follow any of these
procedures on August 14, because FE did not
know for most of that time that its system might
need such treatment.


What training did the operators and reliability
coordinators have for recognizing and responding
to emergencies? FE relied upon on-the-job
experience as training for its operators in handling the
routine business of a normal day, but had never
experienced a major disturbance and had no
simulator training or formal preparation for
recognizing and responding to emergencies. Although all
affected FE and MISO operators were
NERC-certified, NERC certification of operators
addresses basic operational considerations but
offers little insight into emergency operations
issues. Neither group of operators had significant
training, documentation, or actual experience for
how to handle an emergency of this type and
magnitude.

Recommendation 20, page 158


MISO was hindered because it lacked clear
visibility, responsibility, authority, and ability to
take the actions needed in this circumstance.
MISO had interpretive and operational tools and a
large amount of
system data, but had a limited view of FE’s system.
In MISO’s function as FE’s reliability coordinator,
its primary task was to initiate and implement
TLRs, recognize and solve congestion problems in
less dramatic reliability circumstances with
longer solution time periods than those which
existed on August 14, and provide assistance as
requested.

Throughout August 14, most major elements of
FE’s EMS were working properly. The system was
automatically transferring accurate real-time
information about FE’s system conditions to
computers at AEP, MISO, and PJM. FE’s operators did
not believe the transmission line failures reported
by AEP and MISO were real until 15:42 EDT, after
FE conversations with the AEP and MISO control
rooms and calls from FE IT staff to report the
failure of their alarms. At that point in time, FE
operators began to think that their system might be in
jeopardy, but they did not act to restore any of the
lost transmission lines, clearly alert their
reliability coordinator or neighbors about their situation,
or take other possible remedial measures (such as
load-shedding) to stabilize their system.

Figure 5.13. Timeline Phase 4

Phase 4:

138-kV Transmission System 
Collapse in Northern Ohio: 
15:39 to 16:08 EDT 

Overview of This Phase 

As each of FE’s 345-kV lines in the Cleveland area 
tripped out, it increased loading and decreased 
voltage on the underlying 138-kV system serving 
Cleveland and Akron, pushing those lines into 
overload. Starting at 15:39 EDT, the first of an 
eventual sixteen 138-kV lines began to fail (Figure 
5.13). Relay data indicate that each of these lines 
eventually ground faulted, which indicates that it 
sagged low enough to contact something below 
the line. 

Figure 5.14 shows how actual voltages declined at 
key 138-kV buses as the 345- and 138-kV lines 
were lost. As these lines failed, the voltage drops 
caused a number of large industrial customers 
with voltage-sensitive equipment to go off-line 
automatically to protect their operations. As the 
138-kV lines opened, they blacked out customers 
in Akron and the areas west and south of the city, 
ultimately dropping about 600 MW of load. 

Key Phase 4 Events 

Between 15:39 EDT and 15:58:47 EDT seven 
138-kV lines tripped: 

4A) 15:39:17 EDT: Pleasant Valley-West Akron 
138-kV line tripped and reclosed at both ends 
after sagging into an underlying distribution 
line. 


15:42:05 EDT: Pleasant Valley-West Akron 
138-kV West line tripped and reclosed. 

15:44:40 EDT: Pleasant Valley-West Akron 
138-kV West line tripped and locked out. 

4B) 15:42:49 EDT: Canton Central-Cloverdale 
138-kV line tripped on fault and reclosed. 

15:45:39 EDT: Canton Central-Cloverdale 
138-kV line tripped on fault and locked out. 

4C) 15:42:53 EDT: Cloverdale-Torrey 138-kV line 
tripped. 

4D) 15:44:12 EDT: East Lima-New Liberty 138-kV 
line tripped from sagging into an underlying 
distribution line. 

4E) 15:44:32 EDT: Babb-West Akron 138-kV line 
tripped on ground fault and locked out. 

4F) 15:45:40 EDT: Canton Central 345/138 kV
transformer tripped and locked out due to 138
kV circuit breaker operating multiple times,
which then opened the line to FE’s Cloverdale
station.

Figure 5.14. Voltages on FirstEnergy’s 138-kV

4G) 15:51:41 EDT: East Lima-N. Findlay 138-kV 
line tripped, likely due to sagging line, and 
reclosed at East Lima end only. 

4H) 15:58:47 EDT: Chamberlin-West Akron 138-kV
line tripped.

Note: 15:51:41 EDT: Fostoria Central-N. 
Findlay 138-kV line tripped and reclosed, but 
never locked out. 

At 15:59:00 EDT, the West Akron bus tripped due
to breaker failure, causing another five 138-kV
lines to trip:

4I) 15:59:00 EDT: West Akron 138-kV bus tripped,
and cleared bus section circuit breakers
at West Akron 138 kV.

4J) 15:59:00 EDT: West Akron-Aetna 138-kV line 
opened. 

4K) 15:59:00 EDT: Barberton 138-kV line opened 
at West Akron end only. West Akron-B18
138-kV tie breaker opened, affecting West 
Akron 138/12-kV transformers #3, 4 and 5 fed 
from Barberton. 

4L) 15:59:00 EDT: West Akron-Granger-Stoney- 
Brunswick-West Medina opened. 

4M) 15:59:00 EDT: West Akron-Pleasant Valley 
138-kV East line (Q-22) opened. 

4N) 15:59:00 EDT: West Akron-Rosemont-Pine- 
Wadsworth 138-kV line opened. 

From 16:00 EDT to 16:08:59 EDT, four 138-kV 
lines tripped, and the Sammis-Star 345-kV line 
tripped due to high current and low voltage: 

4O) 16:05:55 EDT: Dale-West Canton 138-kV line
tripped due to sag into a tree, reclosed at West 
Canton only 

4P) 16:05:57 EDT: Sammis-Star 345-kV line 
tripped 

4Q) 16:06:02 EDT: Star-Urban 138-kV line tripped 

4R) 16:06:09 EDT: Richland-Ridgeville-Napoleon-Stryker
138-kV line tripped on overload
and locked out at all terminals

4S) 16:08:58 EDT: Ohio Central-Wooster 138-kV 
line tripped 

Note: 16:08:55 EDT: East Wooster-South Canton
138-kV line tripped, but successful automatic
reclosing restored this line.
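
The spacing of these trips can be checked with a few lines of Python. The sketch below is illustrative only: the times are copied from a subset of the event list above, while the shortened labels and the function names are assumptions made for this example.

```python
from datetime import datetime

# Trip times (EDT, August 14, 2003) copied from a subset of the Phase 4
# event list. Labels are shortened for illustration.
EVENTS = [
    ("15:39:17", "Pleasant Valley-West Akron 138-kV"),
    ("15:42:49", "Canton Central-Cloverdale 138-kV"),
    ("15:42:53", "Cloverdale-Torrey 138-kV"),
    ("15:44:12", "East Lima-New Liberty 138-kV"),
    ("15:44:32", "Babb-West Akron 138-kV"),
    ("15:58:47", "Chamberlin-West Akron 138-kV"),
    ("16:05:55", "Dale-West Canton 138-kV"),
    ("16:05:57", "Sammis-Star 345-kV"),
]


def seconds_between(start: str, end: str) -> int:
    """Return the number of seconds from start to end (same day, HH:MM:SS)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


def gaps(events):
    """Return (label, seconds since the previous event) for each later event."""
    return [
        (events[i][1], seconds_between(events[i - 1][0], events[i][0]))
        for i in range(1, len(events))
    ]


if __name__ == "__main__":
    for label, gap in gaps(EVENTS):
        print(f"{label}: {gap} s after the previous trip")
```

For example, `seconds_between("16:05:55", "16:05:57")` gives the two-second gap between the Dale-West Canton and Sammis-Star trips.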


4A-H) Pleasant Valley to Chamberlin-West 
Akron Line Outages 

From 15:39 EDT to 15:58:47 EDT, seven 138-kV
lines in northern Ohio tripped and locked out. At
15:45:41 EDT, Canton Central-Tidd 345-kV line
tripped and reclosed at 15:46:29 EDT because
Canton Central 345/138-kV CB “A1” operated
multiple times, causing a low air pressure problem
that inhibited circuit breaker tripping. This event
forced the Canton Central 345/138-kV
transformers to disconnect and remain out of service,
further weakening the Canton-Akron area 138-kV
transmission system. At 15:58:47 EDT the
Chamberlin-West Akron 138-kV line tripped.

4I-N) West Akron Transformer Circuit Breaker 
Failure and Line Outages 

At 15:59 EDT FE’s West Akron 138-kV bus tripped
due to a circuit breaker failure on West Akron
transformer #1. This caused the five remaining
138-kV lines connected to the West Akron
substation to open. The West Akron 138/12-kV
transformers remained connected to the
Barberton-West Akron 138-kV line, but power flow to West
Akron 138/69-kV transformer #1 was interrupted.

4O-P) Dale-West Canton 138-kV and
Sammis-Star 345-kV Lines Tripped 

After the Cloverdale-Torrey line failed at 15:42
EDT, Dale-West Canton was the most heavily
loaded line on FE’s system. It held on, although
heavily overloaded to 160 and 180% of normal
ratings, until tripping at 16:05:55 EDT. The loss of
this line had a significant effect on the area, and
voltages dropped significantly. More power
shifted back to the remaining 345-kV network,
pushing Sammis-Star’s loading above 120% of
rating. Two seconds later, at 16:05:57 EDT,
Sammis-Star tripped out. Unlike the previous three 345-kV
lines, which tripped on short circuits to ground
due to tree contacts, Sammis-Star tripped because
its protective relays saw low apparent impedance
(depressed voltage divided by abnormally high
line current); that is, the relay reacted as if the high
flow was due to a short circuit. Although three
more 138-kV lines dropped quickly in Ohio
following the Sammis-Star trip, loss of the
Sammis-Star line marked the turning point at which
system problems in northeast Ohio initiated a
cascading blackout across the northeast United States
and Ontario.
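
The relay behavior described above can be illustrated with a simplified calculation. The sketch below treats apparent impedance as the ratio of voltage magnitude to current magnitude and compares it with a single reach threshold. Real distance relays evaluate complex impedance against a shaped characteristic, and the currents, the 90% voltage, and the 100-ohm reach used here are hypothetical values chosen for illustration, not Sammis-Star relay settings.

```python
import math

NOMINAL_LINE_KV = 345.0
# Line-to-neutral voltage seen by one phase of the relay, in kV.
NOMINAL_PHASE_KV = NOMINAL_LINE_KV / math.sqrt(3)

# Hypothetical relay reach, in ohms (an assumption for this example).
HYPOTHETICAL_REACH_OHMS = 100.0


def apparent_impedance_ohms(phase_kv: float, current_ka: float) -> float:
    """Magnitude-only apparent impedance: |Z| = |V| / |I| (kV / kA = ohms)."""
    if current_ka <= 0:
        raise ValueError("current must be positive")
    return phase_kv / current_ka


def relay_picks_up(phase_kv: float, current_ka: float, reach_ohms: float) -> bool:
    """Return True when the apparent impedance falls inside the reach."""
    return apparent_impedance_ohms(phase_kv, current_ka) <= reach_ohms


if __name__ == "__main__":
    # Normal loading: nominal voltage, moderate current (hypothetical 1.0 kA).
    print(relay_picks_up(NOMINAL_PHASE_KV, 1.0, HYPOTHETICAL_REACH_OHMS))
    # Stressed system: voltage depressed to 90%, current at a hypothetical 2.5 kA.
    print(relay_picks_up(0.90 * NOMINAL_PHASE_KV, 2.5, HYPOTHETICAL_REACH_OHMS))
```

With these assumed numbers the relay ignores the normal case and picks up in the stressed case even though no fault exists, which is the mechanism the text attributes to the Sammis-Star trip.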

Losing the 138-kV Transmission Lines 

The tripping of 138-kV transmission lines that
began at 15:39 EDT occurred because the loss
of the combination of the Harding-Chamberlin,
Hanna-Juniper and Star-South Canton 345-kV
lines overloaded the 138-kV system with
electricity flowing north
toward the Akron and Cleveland loads. Modeling
indicates that the return of either the
Hanna-Juniper or Chamberlin-Harding 345-kV lines
would have diminished, but not alleviated, all of
the 138-kV overloads. In theory, the return of both
lines would have restored all the 138-kV lines to
within their emergency ratings.

However, all three 345-kV lines
had already been compromised
due to tree contacts so it is
unlikely that FE would have
successfully restored either line had
they known it had tripped out, and since
Star-South Canton had already tripped and
reclosed three times it is also unlikely that an
operator knowing this would have trusted it to
operate securely under emergency conditions.
While generation redispatch scenarios alone
would not have solved the overload problem,
modeling indicates that shedding load in the
Cleveland and Akron areas may have reduced
most line loadings to within emergency range and
helped stabilize the system. However, the amount
of load shedding required grew rapidly as FE’s
system unraveled.

Preventing the Blackout with Load-Shedding 

The investigation team examined
whether load shedding before the
loss of the Sammis-Star 345-kV
line at 16:05:57 EDT could have
prevented this line loss. The team
found that 1,500 MW of load would have had to be
dropped within the Cleveland-Akron area to
restore voltage at the Star bus from 90.8% (at 120%
of normal and emergency ampere rating) up to
95.9% (at 101% of normal and emergency ampere
rating). 44 The P-V and V-Q analysis reviewed in
Chapter 4 indicated that 95% is the minimum
operating voltage appropriate for 345-kV buses in
the Cleveland-Akron area. The investigation team
concluded that since the Sammis-Star 345 kV
outage was the critical event leading to widespread
cascading in Ohio and beyond, if manual or
automatic load-shedding of 1,500 MW had occurred
within the Cleveland-Akron
area before that outage, the
blackout could have been
averted.

Figure 5.15. Simulated Effect of Prior Outages on

Recommendations 8, page 147; 21, page 158; 23, page 160


Loss of the Sammis-Star 345-kV Line 

Figure 5.15, derived from investigation team
modeling, shows how the power flows shifted across
FE’s 345- and key 138-kV northeast Ohio lines as
the line failures progressed. All lines were
loaded within normal limits after the
Harding-Chamberlin lock-out, but after the
Hanna-Juniper trip at 15:32 EDT, the Star-South
Canton 345-kV line and three 138-kV lines
jumped above normal loadings. After Star-South
Canton locked out at 15:41 EDT within its
emergency rating, five 138-kV and the Sammis-Star
345-kV lines were overloaded. From that point, as
the graph shows, each subsequent line loss
increased loadings on other lines, some loading to
well over 150% of normal ratings before they
failed. The Sammis-Star 345-kV line stayed in
service until it tripped at 16:05:57 EDT.
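
Whether a line counts as "overloaded" depends on which ratings are applied. The sketch below is a minimal illustration, assuming the two sets of Sammis-Star ratings reported in endnote 44 (1,310 MVA normal and emergency as used by FE; 950 MVA normal and 1,076 MVA emergency as used by MISO, PJM and AEP). The 1,200 MVA flow and the function names are hypothetical and were chosen only to show how the same flow is classified differently.

```python
# Ratings in MVA as (normal, emergency), taken from endnote 44.
FE_RATINGS = (1310.0, 1310.0)
NEIGHBOR_RATINGS = (950.0, 1076.0)  # MISO, PJM and AEP


def percent_of_rating(flow_mva: float, rating_mva: float) -> float:
    """Express a flow as a percentage of a rating."""
    return 100.0 * flow_mva / rating_mva


def classify_loading(flow_mva: float, normal_mva: float, emergency_mva: float) -> str:
    """Classify a flow against normal and emergency ratings."""
    if flow_mva <= normal_mva:
        return "within normal"
    if flow_mva <= emergency_mva:
        return "above normal, within emergency"
    return "above emergency"


if __name__ == "__main__":
    flow = 1200.0  # hypothetical flow in MVA, not a value from the report
    for name, (normal, emergency) in (
        ("FE", FE_RATINGS),
        ("MISO/PJM/AEP", NEIGHBOR_RATINGS),
    ):
        pct = percent_of_rating(flow, normal)
        status = classify_loading(flow, normal, emergency)
        print(f"{name}: {pct:.1f}% of normal rating, {status}")
```

Under these assumptions the same flow is within normal limits on FE's ratings but above the emergency rating on the ratings used by its neighbors, which is why a shared set of ratings matters for coordinated contingency analysis.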


FirstEnergy had no automatic load-shedding
schemes in place, and did not attempt to begin
manual load-shedding. As Chapters 4 and 5 have
established, once Sammis-Star tripped, the
possibility of averting the coming cascade by shedding
load ended. Within 6 minutes of these overloads,
extremely low voltages, big power swings and
accelerated line tripping would cause separations
and blackout within the
Eastern Interconnection.

Recommendation 21, page 158


Endnotes 

1 Investigation team field visit to FE 10/8/2003: Steve 
Morgan. 

2 Investigation team field visit to FE, September 3, 2003, 
Hough interview: “When asked whether the voltages seemed 
unusual, he said that some sagging would be expected on a 
hot day, but on August 14th the voltages did seem unusually 
low.” Spidle interview: “The voltages for the day were not 
particularly bad.”

3 Manual of Operations, valid as of March 3, 2003, Process
flowcharts: Voltage Control and Reactive Support - Plant and
System Voltage Monitoring Under Normal Conditions.

4 14:13:18. Channel 16 - Sammis 1. 13:15:49 / Channel 16 - 
West Lorain (FE Reliability Operator (RO) says, “Thanks. 
We’re starting to sag all over the system.”) / 13:16:44. Channel 
16 - Eastlake (talked to two operators) (RO says, “We got a 
way bigger load than we thought we would have.” And “.. .So 
we’re starting to sag all over the system.”) / 13:20:22. Channel 
16 - RO to “Berger” / 13:22:07. Channel 16 - “control room” 
RO says, “We’re sagging all over the system. I need some 
help.” / 13:23:24. Channel 16 - “Control room, Tom” / 
13:24:38. Channel 16 - “Unit 9” / 13:26:04. Channel 16 - 
“Dave” / 13:28:40. Channel 16 “Troy Control.” Also general 
note in RO Dispatch Log. 

5 Example at 13:33:40, Channel 3, FE transcripts. 

6 Investigation team field visit to MISO, Walsh and Seidu 
interviews. 

7 FE had and ran a state estimator every 30 minutes. This
served as a base from which to perform contingency analyses.
FE’s contingency analysis tool used SCADA and EMS inputs
to identify any potential overloads that could result from
various line or equipment outages. FE indicated that it has
experienced problems with the automatic contingency analysis
operation since the system was installed in 1995. As a result,
FE operators or engineers ran contingency analysis manually
rather than automatically, and were expected to do so when
there were questions about the state of the system.
Investigation team interviews of FE personnel indicate that the
contingency analysis model was likely running but not consulted at
any point in the afternoon of August 14.

8 After the Stuart-Atlanta line tripped, Dayton Power & Light
did not immediately provide an update of a change in
equipment availability using a standard form that posts the status
change in the SDX (System Data Exchange, the NERC
database which maintains real-time information on grid
equipment status), which relays that notice to reliability
coordinators and control areas. After its state estimator failed
to solve properly, MISO checked the SDX to make sure that
they had properly identified all available equipment and
outages, but found no posting there regarding Stuart-Atlanta’s
outage.

9 Investigation team field visit, interviews with FE personnel 
on October 8-9, 2003. 

10 DOE Site Visit to First Energy, September 3, 2003,
Interview with David M. Elliott.

11 FE Report, “Investigation of FirstEnergy’s Energy
Management System Status on August 14, 2003,” Bullet 1, Section
4.2.11.

12 Investigation team interviews with FE, October 8-9, 2003. 

13 Investigation team field visit to FE, October 8-9, 2003: team
was advised that FE had discovered this effect during
post-event investigation and testing of the EMS. FE’s report
“Investigation of FirstEnergy’s Energy Management System
Status on August 14, 2003” also indicates that this finding
was “verified using the strip charts from 8-14-03” (page 23),
not that the investigation of this item was instigated by
operator reports of such a failure.

14 There is a conversation between a Phil and a Tom that
speaks of “flatlining” 15:01:33. Channel 15. There is no
mention of AGC or generation control in the DOE Site Visit
interviews with the reliability coordinator.


15 FE Report, “Investigation of FirstEnergy’s Energy
Management System Status on August 14, 2003.”

16 Investigation team field visit to FE, October 8-9, 2003,
Sanicky Interview: “From his experience, it is not unusual for
alarms to fail. Often times, they may be slow to update or they
may die completely. From his experience as a real-time
operator, the fact that the alarms failed did not surprise him.” Also
from same document, Mike McDonald interview, “FE has
previously had [servers] down at the same time. The big issue for
them was that they were not receiving new alarms.”

17 A “cold” reboot of the XA21 system is one in which all
nodes (computers, consoles, etc.) of the system are shut down
and then restarted. Alternatively, a given XA21 node can be
“warm” rebooted wherein only that node is shut down and
restarted, or restarted from a shutdown state. A cold reboot
will take significantly longer to perform than a warm one.
Also during a cold reboot much more of the system is
unavailable for use by the control room operators for visibility or
control over the power system. Warm reboots are not uncommon,
whereas cold reboots are rare. All reboots undertaken by FE’s
IT EMSS support personnel on August 14 were warm reboots.

18 The cold reboot was done in the early morning of 15 
August and corrected the alarm problem as hoped. 

19 Example at 14:19, Channel 14, FE transcripts. 

20 Example at 14:25, Channel 8, FE transcripts. 

21 Example at 14:32, Channel 15, FE transcripts. 

22 “Interim Report, Utility Vegetation Management,”
U.S.-Canada Joint Outage Investigation Task Force,
Vegetation Management Program Review, October 2003, page 7.

23 Investigation team transcript, meeting on September 9, 
2003, comments by Mr. Steve Morgan, Vice President Electric 
Operations: 

Mr. Morgan: The sustained outage history for these lines,
2001, 2002, 2003, up until the event, Chamberlin-Harding
had zero operations for those two-and-a-half years. And
Hanna-Juniper had six operations in 2001, ranging from four
minutes to maximum of 34 minutes. Two were unknown, one
was lightning, one was a relay failure, and two were really
relay scheme mis-operations. They’re category other. And
typically, that—I don’t know what this is particular to
operations, that typically occurs when there is a mis-operation.
Star-South Canton had no operations in that same period of
time, two-and-a-half years. No sustained outages. And
Sammis-Star, the line we haven’t talked about, also no
sustained outages during that two-and-a-half year period. So is it
normal? No. But 345 lines do operate, so it’s not unknown.

24 “Utility Vegetation Management Final Report,” CN Utility 
Consulting, March 2004, page 32. 

25 “FE MISO Findings,” page 11. 

26 FE was conducting right-of-way vegetation maintenance 
on a 5-year cycle, and the tree crew at Hanna-Juniper was 
three spans away, clearing vegetation near the line, when the 
contact occurred on August 14. Investigation team 9/9/03 
meeting transcript, and investigation field team discussion 
with the tree-trimming crew foreman. 

27 Based on “FE MISO Findings” document, page 11. 

28 “Interim Report, Utility Vegetation Management,” 
US-Canada Joint Outage Task Force, Vegetation Management 
Program Review, October 2003, page 6. 

29 Investigation team September 9, 2003 meeting transcripts,
Mr. Steve Morgan, First Energy Vice President, Electric
System Operations:

Mr. Benjamin: Steve, just to make sure that I’m
understanding it correctly, you had indicated that once after
Hanna-Juniper relayed out, there wasn’t really a problem with
voltage on the system until Star-S. Canton operated. But were
the system operators aware that when Hanna-Juniper was
out, that if Star-S. Canton did trip, they would be outside of
operating limits?

Mr. Morgan: I think the answer to that question would have
required a contingency analysis to be done probably on
demand for that operation. It doesn’t appear to me that a
contingency analysis, and certainly not a demand contingency
analysis, could have been run in that period of time. Other
than experience, I don’t know that they would have been able
to answer that question. And what I know of the record right
now is that it doesn’t appear that they ran contingency
analysis on demand.

Mr. Benjamin: Could they have done that? 

Mr. Morgan: Yeah, presumably they could have. 

Mr. Benjamin: You have all the tools to do that? 

Mr. Morgan: They have all the tools and all the information is 
there. And if the State Estimator is successful in solving, and 
all the data is updated, yeah, they could have. I would say in 
addition to those tools, they also have access to the planning 
load flow model that can actually run the same—full load of 
the model if they want to. 

30 Example synchronized at 14:32 (from 13:32) #18 041 
TDC-E2 283.wav, AEP transcripts. 

31 Example synchronized at 14:19 #2 020 TDC-E1 266.wav,
AEP transcripts. 

32 Example at 15:36 Channel 8, FE transcripts. 

33 Example at 15:41:30 Channel 3, FE transcripts. 


34 Example synchronized at 15:36 (from 14:43) Channel 20, 
MISO transcripts. 

35 Example at 15:42:49, Channel 8, FE transcripts. 

36 Example at 15:46:00, Channel 8 FE transcripts. 

37 Example at 15:45:18, Channel 4, FE transcripts. 

38 Example at 15:46:00, Channel 8 FE transcripts. 

39 Example at 15:50:15, Channel 12 FE transcripts. 

40 Example synchronized at 15:48 (from 14:55), channel 22, 
MISO transcripts. 

41 Example at 15:56:00, Channel 31, FE transcripts. 

42 FE Transcripts 15:45:18 on Channel 4 and 15:56:49 on 
Channel 31. 

43 The operator logs from FE’s Ohio control center indicate 
that the west desk operator knew of the alarm system failure 
at 14:14, but that the east desk operator first knew of this 
development at 15:45. These entries may have been entered 
after the times noted, however. 

44 The investigation team determined that FE was using a
different set of line ratings for Sammis-Star than those being
used in the MISO and PJM reliability coordinator calculations
or by its neighbor AEP. Specifically, FE was operating
Sammis-Star assuming that the 345-kV line was rated for
summer normal use at 1,310 MVA, with a summer emergency
limit rating of 1,310 MVA. In contrast, MISO, PJM and AEP
were using a more conservative rating of 950 MVA normal
and 1,076 MVA emergency for this line. The facility owner (in
this case FE) is the entity which provides the line rating; when
and why the ratings were changed and not communicated to
all concerned parties has not been determined.

6. The Cascade Stage of the Blackout


Chapter 5 described how uncorrected problems in
northern Ohio developed to 16:05:57 EDT, the last
point at which a cascade of line trips could have
been averted. However, the Task Force’s
investigation also sought to understand how and why the
cascade spread and stopped as it did. As detailed
below, the investigation determined the sequence
of events in the cascade, and how and why it
spread, and how it stopped in each general
geographic area.

Based on the investigation to date, the
investigation team concludes that the cascade spread
beyond Ohio and caused such a widespread
blackout for three principal reasons. First, the loss of the
Sammis-Star 345-kV line in Ohio, following the
loss of other transmission lines and weak voltages
within Ohio, triggered many subsequent line trips.
Second, many of the key lines which tripped
between 16:05:57 and 16:10:38 EDT operated on
zone 3 impedance relays (or zone 2 relays set to
operate like zone 3s) which responded to
overloads rather than true faults on the grid. The speed
at which they tripped extended the reach and
accelerated the spread of the cascade beyond the
Cleveland-Akron area. Third, the evidence collected
indicates that the relay protection settings for the
transmission lines, generators and
under-frequency load-shedding in the northeast may not be
entirely appropriate and are certainly not
coordinated and integrated to reduce the likelihood and
consequences of a cascade, nor were they
intended to do so. These issues are discussed in
depth below.

This analysis is based on close examination of the
events in the cascade, supplemented by complex,
detailed mathematical modeling of the electrical
phenomena that occurred. At the completion of
this report, the modeling had progressed through
16:10:40 EDT, and was continuing. Thus this
chapter is informed and validated by modeling
(explained below) up until that time. Explanations
after that time reflect the investigation team’s best
hypotheses given the available data, and may be
confirmed or modified when the modeling is
complete. However, simulation of these events is so
complex that it may be impossible to ever
completely prove these or other theories about the
fast-moving events of August 14. Final modeling
results will be published by NERC as a technical
report in several months.

Why Does a Blackout Cascade? 

Major blackouts are rare, and no two blackout
scenarios are the same. The initiating events will
vary, including human actions or inactions,
system topology, and load/generation balances. Other
factors that will vary include the distance between
generating stations and major load centers, voltage
profiles across the grid, and the types and settings
of protective relays in use.

Some wide-area blackouts start with short circuits
(faults) on several transmission lines in short
succession, sometimes resulting from natural causes
such as lightning or wind or, as on August 14,
resulting from inadequate tree management in
right-of-way areas. A fault causes a high current
and low voltage on the line containing the fault. A
protective relay for that line detects the high
current and low voltage and quickly trips the circuit
breakers to isolate that line from the rest of the
power system.

A cascade is a dynamic phenomenon that cannot
be stopped by human intervention once started. It
occurs when there is a sequential tripping of
numerous transmission lines and generators in a
widening geographic area. A cascade can be
triggered by just a few initiating events, as was seen
on August 14. Power swings and voltage
fluctuations caused by these initial events can cause
other lines to detect high currents and low
voltages that appear to be faults, even if faults do not
actually exist on those other lines. Generators are
tripped off during a cascade to protect them from
severe power and voltage swings. Protective relay
systems work well to protect lines and generators
from damage and to isolate them from the system
under normal and abnormal system conditions.

But when power system operating and design
criteria are violated because several outages occur
simultaneously, commonly used protective relays
that measure low voltage and high current cannot
distinguish between the currents and voltages
seen in a system cascade and those caused by a
fault. This leads to more and more lines and
generators being tripped, widening the blackout area.
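
The feedback loop described above, in which each trip pushes more flow onto the lines that remain, can be shown with a toy model. The sketch below is a deliberately crude illustration and not a power-flow simulation: it assumes that the flow of a tripped line is shared equally among the surviving lines, whereas real flows divide according to network impedances. All line names, flows, and ratings are hypothetical.

```python
def simulate_cascade(flows, ratings, first_trip):
    """Return the order in which lines trip in a toy equal-sharing model.

    flows      -- dict of line name -> initial flow (arbitrary units)
    ratings    -- dict of line name -> trip threshold (same units)
    first_trip -- name of the line that trips first
    """
    flows = dict(flows)          # work on a copy
    in_service = set(flows)
    tripped_order = []
    to_trip = [first_trip]

    while to_trip:
        # Remove the tripping lines and collect the flow they carried.
        shed = 0.0
        for name in to_trip:
            in_service.discard(name)
            shed += flows.pop(name)
            tripped_order.append(name)
        if not in_service:
            break
        # Assumption: the lost flow is shared equally by surviving lines.
        share = shed / len(in_service)
        for name in in_service:
            flows[name] += share
        # Any line now above its rating trips in the next round.
        to_trip = sorted(n for n in in_service if flows[n] > ratings[n])

    return tripped_order


if __name__ == "__main__":
    flows = {"A": 100.0, "B": 80.0, "C": 60.0, "D": 50.0}
    tight = {"A": 120.0, "B": 110.0, "C": 100.0, "D": 200.0}
    roomy = {"A": 300.0, "B": 300.0, "C": 300.0, "D": 300.0}
    print(simulate_cascade(flows, tight, "A"))
    print(simulate_cascade(flows, roomy, "A"))
```

With the tighter hypothetical ratings a single trip takes out every line in turn, while with ample margin the first trip is absorbed. The margin available after the first outage is what separates the two outcomes in this toy model.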

How Did the Cascade Evolve on 
August 14? 

A series of line outages in northeast Ohio starting
at 15:05 EDT caused heavy loadings on parallel
circuits, leading to the trip and lock-out of FE’s
Sammis-Star 345-kV line at 16:05:57 Eastern
Daylight Time. This was the event that triggered a
cascade of interruptions on the high voltage system,
causing electrical fluctuations and facility trips
such that within seven minutes the blackout
rippled from the Cleveland-Akron area across much
of the northeast United States and Canada. By
16:13 EDT, more than 508 generating units at 265
power plants had been lost, and tens of millions of
people in the United States and Canada were
without electric power.

The events in the cascade started relatively
slowly. Figure 6.1 illustrates how the number of
lines and generation lost stayed relatively low
during the Ohio phase of the blackout, but then
picked up speed after 16:08:59 EDT. The cascade
was complete only three minutes later.


Chapter 5 described the four phases that led to the 
initiation of the cascade at about 16:06 EDT. After 
16:06 EDT, the cascade evolved in three distinct 
phases: 

♦ Phase 5. The collapse of FE’s transmission system
induced unplanned shifts of power across
the region. Shortly before the collapse, large 
(but normal) electricity flows were moving 
across FE’s system from generators in the south 
(Tennessee and Kentucky) and west (Illinois 
and Missouri) to load centers in northern Ohio, 
eastern Michigan, and Ontario. A series of lines 
within northern Ohio tripped under the high 

Figure 6.1. Rate of Line and Generator Trips During 
the Cascade 



Impedance Relays 

The most common protective device for transmission
lines is the impedance (Z) relay (also
known as a distance relay). It detects changes in
currents (I) and voltages (V) to determine the
apparent impedance (Z = V/I) of the line. A relay
is installed at each end of a transmission line. 
Each relay is actually three relays within one, 
with each element looking at a particular “zone” 
or length of the line being protected. 

♦ The first zone looks for faults over 80% of the 
line next to the relay, with no time delay before 
the trip. 

♦ The second zone is set to look at the entire line 
and slightly beyond the end of the line with a 
slight time delay. The slight delay on the zone 
2 relay is useful when a fault occurs near one 
end of the line. The zone 1 relay near that end 
operates quickly to trip the circuit breakers on 
that end. However, the zone 1 relay on the 
other end may not be able to tell if the fault is 


just inside the line or just beyond the line. In 
this case, the zone 2 relay on the far end trips 
the breakers after a short delay, after the zone 1 
relay near the fault opens the line on that end 
first. 

♦ The third zone is slower acting and looks for 
line faults and faults well beyond the length of 
the line. It can be thought of as a remote relay 
or breaker backup, but should not trip the 
breakers under typical emergency conditions. 

An impedance relay operates when the apparent 
impedance, as measured by the current and volt¬ 
age seen by the relay, falls within any one of the 
operating zones for the appropriate amount of 
time for that zone. The relay will trip and cause 
circuit breakers to operate and isolate the line. 
All three relay zone operations protect lines from 
faults and may trip from apparent faults caused 
by large swings in voltages and currents. 
loads, hastened by the impact of Zone 3 impedance
relays. This caused a series of shifts in
power flows and loadings, but the grid stabilized
after each.

♦ Phase 6. After 16:10:36 EDT, the power surges 
resulting from the FE system failures caused 
lines in neighboring areas to see overloads that 
caused impedance relays to operate. The result 
was a wave of line trips through western Ohio 
that separated AEP from FE. Then the line trips 
progressed northward into Michigan separating 
western and eastern Michigan, causing a power 
flow reversal within Michigan toward Cleve¬ 
land. Many of these line trips were from Zone 3 
impedance relay actions that accelerated the 
speed of the line trips and reduced the potential 
time in which grid operators might have identi¬ 
fied the growing problem and acted construc¬ 
tively to contain it. 

With paths cut from the west, a massive power 
surge flowed from PJM into New York and 
Ontario in a counter-clockwise flow around 
Lake Erie to serve the load still connected in 
eastern Michigan and northern Ohio. Relays on 
the lines between PJM and New York saw this 
massive power surge as faults and tripped those 
lines. Ontario’s east-west tie line also became 
overloaded and tripped, leaving northwest 
Ontario connected to Manitoba and Minnesota. 
The entire northeastern United States and east¬ 
ern Ontario then became a large electrical 
island separated from the rest of the Eastern 
Interconnection. This large area, which had 
been importing power prior to the cascade, 
quickly became unstable after 16:10:38 as there 
was not sufficient generation on-line within the 
island to meet electricity demand. Systems to 
the south and west of the split, such as PJM, 
AEP and others further away, remained intact 
and were mostly unaffected by the outage. Once 
the northeast split from the rest of the Eastern 
Interconnection, the cascade was isolated. 

♦ Phase 7. In the final phase, after 16:10:46 EDT, 
the large electrical island in the northeast had 
less generation than load, and was unstable 
with large power surges and swings in fre¬ 
quency and voltage. As a result, many lines and 
generators across the disturbance area tripped, 
breaking the area into several electrical islands. 
Generation and load within these smaller 
islands was often unbalanced, leading to fur¬ 
ther tripping of lines and generating units until 
equilibrium was established in each island. 


Although much of the disturbance area was 
fully blacked out in this process, some islands 
were able to reach equilibrium without total 
loss of service. For example, the island consist¬ 
ing of most of New England and the Maritime 
Provinces stabilized and generation and load 
returned to balance. Another island consisted of 
load in western New York and a small portion of 
Ontario, supported by some New York genera¬ 
tion, the large Beck and Saunders plants in 
Ontario, and the 765-kV interconnection to 
Quebec. This island survived but some other 
areas with large load centers within the island 
collapsed into a blackout condition (Figure 6.2). 

What Stopped the August 14 Blackout 
from Cascading Further? 

The investigation concluded that a combination of 
the following factors determined where and when 
the cascade stopped spreading: 

♦ The effects of a disturbance travel over power 
lines and become damped the further they are 
from the initial point, much like the ripple from 
a stone thrown in a pond. Thus, the voltage and 
current swings seen by relays on lines farther 
away from the initial disturbance are not as 
severe, and at some point they are no longer suf¬ 
ficient to cause lines to trip. 

♦ Higher voltage lines and more densely net¬ 
worked lines, such as the 500-kV system in PJM 
and the 765-kV system in AEP, are better able to 
absorb voltage and current swings and thus 
serve as a barrier to the spread of a cascade. As 
seen in Phase 6, the cascade progressed into 
western Ohio and then northward through 
Michigan through the areas that had the fewest 
transmission lines. Because there were fewer 


Figure 6.2. Area Affected by the Blackout 
System Oscillations, Stable, Transient, and Dynamic Conditions


The electric power system constantly experi¬ 
ences small power oscillations that do not lead to 
system instability. They occur as generator rotors 
accelerate or slow down while rebalancing elec¬ 
trical output power to mechanical input power, 
to respond to changes in load or network condi¬ 
tions. These oscillations are observable in the 
power flow on transmission lines that link gener¬ 
ation to load or in the tie lines that link different 
regions of the system together. But with a distur¬ 
bance to the network, the oscillations can 
become more severe, even to the point where 
flows become progressively so great that protec¬ 
tive relays trip the connecting lines. If the lines 
connecting different electrical regions separate, 
each region will find its own frequency, depend¬ 
ing on the load to generation balance at the time 
of separation. 

Oscillations that grow in amplitude are called 
unstable oscillations. Such oscillations, once ini¬ 
tiated, cause power to flow back and forth across 
the system like water sloshing in a rocking tub. 

In a stable electric system, if a disturbance such 
as a fault occurs, the system will readjust and 
rebalance within a few seconds after the fault 
clears. If a fault occurs, protective relays can trip 
in less than 0.1 second. If the system recovers 
and rebalances within less than 3 seconds, with 
the possible loss of only the faulted element and 
a few generators in the area around the fault, then 
that condition is termed “transiently stable.” If 
the system takes from 3 to 30 seconds to recover 
and stabilize, it is “dynamically stable.” But in 


rare cases when a disturbance occurs, the system 
may appear to rebalance quickly, but it then 
over-shoots and the oscillations can grow, caus¬ 
ing widespread instability that spreads in terms 
of both the magnitude of the oscillations and in 
geographic scope. This can occur in a system that 
is heavily loaded, causing the electrical distance 
(apparent impedance) between generators to be 
longer, making it more difficult to keep the 
machine angles and speeds synchronized. In a 
system that is well damped, the oscillations will 
settle out quickly and return to a steady balance. 
If the oscillation continues over time, neither 
growing nor subsiding, it is a poorly damped 
system. 
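The recovery-time thresholds and damping behavior described above can be sketched in a few lines. This is only an illustration: the function names, the 0.5 Hz oscillation frequency, and the damping values are assumptions chosen for the example, not quantities taken from the investigation.

```python
import math

def classify_recovery(seconds_to_recover):
    """Label a disturbance using the time thresholds given in the text."""
    if seconds_to_recover < 3:
        return "transiently stable"
    if seconds_to_recover <= 30:
        return "dynamically stable"
    return "not classified by the text"

def oscillation(t, damping, amplitude=1.0, freq_hz=0.5):
    """Sinusoid with an exponential envelope exp(-damping * t).

    damping > 0: the swing settles out (well damped)
    damping == 0: the swing neither grows nor subsides (poorly damped)
    damping < 0: the swing grows (unstable oscillation)
    All parameter values are hypothetical.
    """
    return amplitude * math.exp(-damping * t) * math.cos(2 * math.pi * freq_hz * t)

if __name__ == "__main__":
    for seconds in (1, 10):
        print(seconds, "s ->", classify_recovery(seconds))
    for damping in (0.5, 0.0, -0.5):
        print("damping", damping, "swing at t = 10 s:", oscillation(10.0, damping))
```

With positive damping the swing at 10 seconds is far smaller than its starting amplitude, with zero damping it is unchanged, and with negative damping it has grown, which matches the three cases the text distinguishes.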

The illustration below, of a weight hung on a 
spring balance, illustrates a system which oscil¬ 
lates over several cycles to return to balance. A 
critical point to observe is that in the process of 
hunting for its balance point, the spring over¬ 
shoots the true weight and balance point of the 
spring and weight combined, and must cycle 
through a series of exaggerated overshoots and 
underweight rebounds before settling down to 
rest at its true balance point. The same process 
occurs on an electric system, as can be observed 
in this chapter. 

If a system is in transient instability, the oscilla¬ 
tions following a disturbance will grow in magni¬ 
tude rather than settle out, and it will be unable 
to readjust to a stable, steady state. This is what 
happened to the area that blacked out on August 
14, 2003. 
lines, each line absorbed more of the power and
voltage surges and was more vulnerable to trip¬ 
ping. A similar effect was seen toward the east 
as the lines between New York and Pennsylva¬ 
nia, and eventually northern New Jersey trip¬ 
ped. The cascade of transmission line outages 
became contained after the northeast United 
States and Ontario were completely separated 
from the rest of the Eastern Interconnection and 
no more power flows were possible into the 
northeast (except the DC ties from Quebec, 
which continued to supply power to western 
New York and New England). 

♦ Line trips isolated some areas from the portion 
of the grid that was experiencing instability. 
Many of these areas retained sufficient on-line 
generation or the capacity to import power from 
other parts of the grid, unaffected by the surges 
or instability, to meet demand. As the cascade 
progressed, and more generators and lines trip¬ 
ped off to protect themselves from severe dam¬ 
age, some areas completely separated from the 
unstable part of the Eastern Interconnection. In 
many of these areas there was sufficient genera¬ 
tion to match load and stabilize the system. 
After the large island was formed in the north¬ 
east, symptoms of frequency and voltage decay 
emerged. In some parts of the northeast, the sys¬ 
tem became too unstable and shut itself down. 
In other parts, there was sufficient generation, 
coupled with fast-acting automatic load shed¬ 
ding, to stabilize frequency and voltage. In this 
manner, most of New England and the Maritime 
Provinces remained energized. Approximately 
half of the generation and load remained on in 
western New York, aided by generation in 
southern Ontario that split and stayed with 
western New York. There were other smaller 
isolated pockets of load and generation that 
were able to achieve equilibrium and remain 
energized. 

Phase 5: 

345-kV Transmission System 
Cascade in Northern Ohio and 
South-Central Michigan 

Overview of This Phase 

After the loss of FE’s Sammis-Star 345-kV line and 
the underlying 138-kV system, there were no 
large capacity transmission lines left from the 
south to support the significant amount of load in 
northern Ohio (Figure 6.3). This overloaded the 


transmission paths west and northwest into Mich¬ 
igan, causing a sequential loss of lines and power 
plants. 

Key Events in This Phase 

5A) 16:05:57 EDT: Sammis-Star 345-kV tripped 
by zone 3 relay. 

5B) 16:08:59 EDT: Galion-Ohio Central-Muskingum
345-kV line tripped on zone 3 relay.

5C) 16:09:06 EDT: East Lima-Fostoria Central
345-kV line tripped on zone 3 relay, causing
major power swings through New York and
Ontario into Michigan.

5D) 16:09:08 EDT to 16:10:27 EDT: Several power
plants lost, totaling 946 MW.

5A) Sammis-Star 345-kV Tripped: 16:05:57 EDT 

Sammis-Star did not trip due to a short circuit to 
ground (as did the prior 345-kV lines that tripped). 
Sammis-Star tripped due to protective zone 3 
relay action that measured low apparent imped¬ 
ance (depressed voltage divided by abnormally 
high line current) (Figure 6.4). There was no fault 
and no major power swing at the time of the 
trip—rather, high flows above the line’s emer¬ 
gency rating together with depressed voltages 
caused the overload to appear to the protective 
relays as a remote fault on the system. In effect, the 
relay could no longer differentiate between a 
remote three-phase fault and an exceptionally 
high line-load condition. Moreover, the reactive 
flows (VAr) on the line were almost ten times 
higher than they had been earlier in the day 
because of the current overload. The relay oper¬ 
ated as it was designed to do. 
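A rough numerical sketch may help show why heavy load with depressed voltage can look like a remote fault to an impedance relay. The voltages, currents, and zone 3 reach below are hypothetical round numbers, not values recorded on the Sammis-Star line, and the relay is reduced to a simple magnitude comparison (a real distance relay also takes the impedance angle into account).

```python
import math

def apparent_impedance_ohms(line_kv, line_amps):
    """Magnitude of the apparent impedance Z = V/I seen by a relay, per phase.

    line_kv is the line-to-line voltage in kV, so the phase voltage is
    line_kv / sqrt(3).
    """
    phase_volts = line_kv * 1000.0 / math.sqrt(3)
    return phase_volts / line_amps

# Hypothetical zone 3 reach in ohms; an assumption for illustration only.
ZONE3_REACH_OHMS = 100.0

# Hypothetical operating points: nominal voltage with moderate current,
# then 90% voltage with much higher current.
normal = apparent_impedance_ohms(line_kv=345.0, line_amps=1000.0)
stressed = apparent_impedance_ohms(line_kv=0.90 * 345.0, line_amps=2200.0)

if __name__ == "__main__":
    for label, z in (("normal", normal), ("stressed", stressed)):
        inside = z < ZONE3_REACH_OHMS
        print(f"{label}: Z = {z:.1f} ohms, inside zone 3 reach: {inside}")
```

In this example the normal operating point sits at roughly 199 ohms, well outside the assumed reach, while the stressed point falls to about 81 ohms and lands inside it even though no fault exists.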


Figure 6.3. Sammis-Star 345-kV Line Trip, 
16:05:57 EDT 
The Sammis-Star 345-kV line trip completely severed
the 345-kV path into northern Ohio from
southeast Ohio, triggering a new, fast-paced 
sequence of 345-kV transmission line trips in 
which each line trip placed a greater flow burden 
on those lines remaining in service. These line 
outages left only three paths for power to flow into 
western Ohio: (1) from northwest Pennsylvania to 
northern Ohio around the south shore of Lake 
Erie, (2) from southwest Ohio toward northeast 
Ohio, and (3) from eastern Michigan and Ontario. 
The line interruptions substantially weakened 
northeast Ohio as a source of power to eastern 
Michigan, making the Detroit area more reliant on 
345-kV lines west and northwest of Detroit, and 
from northwestern Ohio to eastern Michigan. The 
impact of this trip was felt across the grid—it 
caused a 100 MW increase in flow from PJM into 
New York and through to Ontario. 1 Frequency in 
the Eastern Interconnection increased momen¬ 
tarily by 0.02 Hz. 

Soon after the Sammis-Star trip, four of the five 48 
MW Handsome Lake combustion turbines in 
western Pennsylvania tripped off-line. These 
units are connected to the 345-kV system by the 
Homer City-Wayne 345-kV line, and were operat¬ 
ing that day as synchronous condensers to partici¬ 
pate in PJM’s spinning reserve market (not to 
provide voltage support). When Sammis-Star trip¬ 
ped and increased loadings on the local transmis¬ 
sion system, the Handsome Lake units were close 
enough electrically to sense the impact and trip¬ 
ped off-line at 16:07:00 EDT on under-voltage. 

During the period between the Sammis-Star trip 
and the trip of East Lima-Fostoria at 16:09:06.3 
EDT, the system was still in a steady-state condi¬ 
tion. Although one line after another was 


Figure 6.4. Sammis-Star 345-kV Line Trip 
overloading and tripping within Ohio, this was
happening slowly enough under relatively stable 
conditions that the system could readjust—after 
each line loss, power flows would redistribute 
across the remaining lines. This is illustrated in 
Figure 6.5, which shows the MW flows on the 
Michigan Electrical Coordinated Systems (MECS) 
interfaces with AEP (Ohio), FirstEnergy (Ohio) 
and Ontario. The graph shows a shift from 150 
MW imports to 200 MW exports from the MECS 
system into FirstEnergy at 16:05:57 EDT after the 
loss of Sammis-Star, after which this held steady 
until 16:08:59, when the loss of East Lima-Fostoria 
Central cut the main energy path from the south 
and west into Cleveland and Toledo. Loss of this 
path was significant, causing flow from MECS into 
FE to jump from 200 MW up to 2,300 MW, where 
it bounced somewhat before stabilizing, roughly, 
until the path across Michigan was cut at 16:10:38 
EDT. 

Transmission Lines into Northwestern Ohio 
Tripped, and Generation Tripped in South 
Central Michigan and Northern Ohio: 16:08:59 
EDT to 16:10:27 EDT 

5B) 16:08:59 EDT: Galion-Ohio Central-Muskingum
345-kV line tripped

5C) 16:09:06 EDT: East Lima-Fostoria Central 
345-kV line tripped, causing a large power 
swing from Pennsylvania and New York 
through Ontario to Michigan 

The tripping of the Galion-Ohio Central- 
Muskingum and East Lima-Fostoria Central 


Figure 6.5. Line Flows Into Michigan 



Energy Management System, which records flow quantities 
every 2 seconds. As a result, the fast power swings that 
occurred between 16:10:36 and 16:13 were not captured by the
recorders and are not reflected in these curves. 
345-kV transmission lines removed the transmission
paths from southern and western Ohio into
northern Ohio and eastern Michigan. Northern 
Ohio was connected to eastern Michigan by only 
three 345-kV transmission lines near the south¬ 
western bend of Lake Erie. Thus, the combined 
northern Ohio and eastern Michigan load centers 
were left connected to the rest of the grid only by: 
(1) transmission lines eastward from northeast 
Ohio to northwest Pennsylvania along the south¬ 
ern shore of Lake Erie, and (2) westward by lines 
west and northwest of Detroit, Michigan and from 
Michigan into Ontario (Figure 6.6). 


The Galion-Ohio Central-Muskingum 345-kV line
tripped first at Muskingum at 16:08:58.5 EDT on a
phase-to-ground fault, reclosed and tripped again
at 16:08:58.6 at Ohio Central, reclosed and tripped
again at Muskingum on a Zone 3 relay, and finally
tripped at Galion on a ground fault.


After the Galion-Ohio Central-Muskingum line 
outage and numerous 138-kV line trips in central 
Ohio, the East Lima-Fostoria Central 345-kV line 
tripped at 16:09:06 EDT on Zone 3 relay operation 
due to high current and extremely low voltage 
(80%). Investigation team modeling indicates that
if automatic under-voltage load-shedding had
been in place in northeast Ohio, it might have
been triggered at or before this point, and dropped
enough load to reduce or eliminate the subsequent
line overloads that spread the cascade.

Recommendations 8, page 147; 21, page 158


Figure 6.7, a high-speed recording of 345-kV flows 
past Niagara Falls from the Hydro One recorders, 


Figure 6.6. Ohio 345-kV Lines Trip, 16:08:59 to 
16:09:07 EDT 



shows the impact of the East Lima-Fostoria Central
trip and the New York to Ontario power swing,
which continued to oscillate for over 10 seconds.
Looking at the MW flow line, it is clear that when 
Sammis-Star tripped, the system experienced 
oscillations that quickly damped out and 
rebalanced. But East Lima-Fostoria triggered sig¬ 
nificantly greater oscillations that worsened in 
magnitude for several cycles, and returned to sta¬ 
bility but continued to flutter until the 
Argenta-Battle Creek trip 90 seconds later. Volt¬ 
ages also began declining at this time. 

After the East Lima-Fostoria Central trip, power 
flows increased dramatically and quickly on the 
lines into and across southern Michigan. 
Although power had initially been flowing north¬ 
east out of Michigan into Ontario, that flow sud¬ 
denly reversed and approximately 500 to 700 MW 
of power (measured at the Michigan-Ontario bor¬ 
der, and 437 MW at the Ontario-New York border 
at Niagara) flowed southwest out of Ontario 
through Michigan to serve the load of Cleveland 
and Toledo. This flow was fed by 700 MW pulled 
out of PJM through New York on its 345-kV net¬ 
work. 2 This was the first of several inter-area 
power and frequency events that occurred over 
the next two minutes. This was the system’s 
response to the loss of the northwest Ohio trans¬ 
mission paths (above), and the stress that the 
still-high Cleveland, Toledo, and Detroit loads put 
onto the surviving lines and local generators. 

Figure 6.7 also shows the magnitude of subsequent
flows and voltages at the New York-Ontario
Niagara border, triggered by the trips of the
Argenta-Battle Creek, Argenta-Tompkins, Hampton-Pontiac
and Thetford-Jewell 345-kV lines in
Michigan, and the Erie West-Ashtabula-Perry


Figure 6.7. New York-Ontario Line Flows at Niagara 
345-kV line linking the Cleveland area to Pennsylvania.
Farther south, the very low voltages on the
northern Ohio transmission system made it very 
difficult for the generation in the Cleveland and 
Lake Erie area to maintain synchronism with the 
Eastern Interconnection. Over the next two min¬ 
utes, generators in this area shut down after reach¬ 
ing a point of no recovery as the stress level across 
the remaining ties became excessive. 

Figure 6.8, of metered power flows along the New 
York interfaces, documents how the flows head¬ 
ing north and west toward Detroit and Cleveland 
varied at different points on the grid. Beginning at 
16:09:05 EDT, power flows jumped simulta¬ 
neously across all three interfaces—but when the 
first power surge peaked at 16:09:09, the change in 
flow was highest on the PJM interface and lowest 
on the New England interface. Power flows 
increased significantly on the PJM-NY and NY- 
Ontario interfaces because of the redistribution of 
flow around Lake Erie. The New England and Mar¬ 
itime systems maintained the same generation to 
load balance and did not carry the redistributed 
flows because they were not in the direct path of 
the flows, so that interface with New York showed 
little response. 

Before this first major power swing on the Michi¬ 
gan/Ontario interface, power flows in the NPCC 
Region (Quebec, Ontario and the Maritimes, New 
England and New York) were typical for the sum¬ 
mer period, and well within acceptable limits. 
Transmission and generation facilities were then 
in a secure state across the NPCC region. 

Zone 3 Relays and the Start of the Cascade 

Zone 3 relays are set to provide breaker failure and 
relay backup for remote distance faults on a trans¬ 
mission line. If it senses a fault past the immediate 


Figure 6.8. First Power Swing Has Varying Impacts 
Across the Grid 



reach of the line and its zone 1 and zone 2 settings, 
a zone 3 relay waits through a 1 to 2 second time 
delay to allow the primary line protection to act 
first. A few lines have zone 3 settings designed 
with overload margins close to the long-term 
emergency limit of the line, because the length 
and configuration of the line dictate a higher 
apparent impedance setting. Thus it is possible for 
a zone 3 relay to operate on line load or overload in 
extreme contingency conditions even in the 
absence of a fault (which is why many regions in 
the United States and Canada have eliminated the 
use of zone 3 relays on 230-kV and greater lines). 
Some transmission operators set zone 2 relays to 
serve the same purpose as zone 3s—i.e., to reach 
well beyond the length of the line it is protecting 
and protect against a distant fault on the outer 
lines. 
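The interaction between zone reach and time delay can be sketched as a small timer model. The reach values in ohms and the exact delays below are hypothetical settings chosen for illustration (the zone 3 delay is picked from within the 1 to 2 second range given above), and the function and variable names are not taken from any relay vendor or from the investigation.

```python
# Hypothetical settings: (zone name, reach in ohms, time delay in seconds).
ZONES = [
    ("zone 1", 40.0, 0.0),
    ("zone 2", 60.0, 0.3),
    ("zone 3", 100.0, 1.5),
]

def first_trip(samples, zones=ZONES):
    """Return (zone_name, trip_time) for the first zone to time out, or None.

    samples is a list of (time_s, apparent_impedance_ohms) pairs in time
    order. A zone's timer starts when the impedance enters its reach and
    resets whenever the impedance leaves it.
    """
    entered = {name: None for name, _, _ in zones}
    for t, z in samples:
        for name, reach, delay in zones:
            if z < reach:
                if entered[name] is None:
                    entered[name] = t
                if t - entered[name] >= delay:
                    return name, t
            else:
                entered[name] = None
    return None

if __name__ == "__main__":
    # Sustained heavy load: impedance stays inside zone 3 only.
    sustained = [(0.0, 80.0), (0.5, 80.0), (1.0, 80.0), (1.5, 80.0), (2.0, 80.0)]
    # Brief swing: impedance dips into zone 3 but leaves before the delay expires.
    brief = [(0.0, 80.0), (0.5, 80.0), (1.0, 150.0), (1.5, 80.0), (2.0, 150.0)]
    # Close-in fault: impedance collapses into zone 1.
    fault = [(0.0, 10.0)]
    print(first_trip(sustained))
    print(first_trip(brief))
    print(first_trip(fault))
```

Under these assumed settings a sustained overload trips on zone 3 once the delay expires, a brief swing does not trip at all, and a close-in fault trips zone 1 immediately, which is the behavior the text attributes to the three zones.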

The Sammis-Star line tripped at 16:05:57 EDT on 
a zone 3 impedance relay although there were no 
faults occurring at the time, because increased real 
and reactive power flow caused the apparent 
impedance to be within the impedance circle 
(reach) of the relay. Between 16:06:01 and 
16:10:38.6 EDT, thirteen more important 345 and 
138-kV lines tripped on zone 3 operations that 
afternoon at the start of the cascade, including 
Galion-Ohio Central-Muskingum, East Lima- 
Fostoria Central, Argenta-Battle Creek, Argenta- 
Tompkins, Battle Creek-Oneida, and Perry- 
Ashtabula (Figure 6.9). These included several 
zone 2 relays in Michigan that had been set to 
operate like zone 3s, overreaching the line by more 
than 200% with no intentional time delay for 
remote breaker failure protection. 3 All of these 
relays operated according to their settings. However,
the zone 3 relays (and zone 2 relays acting
like zone 3s) acted so quickly that they impeded
the natural ability of the electric system to hold 
together, and did not allow for any operator inter¬ 
vention to attempt to stop the spread of the cas¬ 
cade. The investigation team concluded that 
because these zone 2 and 3 relays tripped after 
each line overloaded, these relays were the com¬ 
mon mode of failure that accelerated the geo¬ 
graphic spread of the cascade. Given grid 
conditions and loads and the limited operator 
tools available, the speed of the zone 2 and 3 oper¬ 
ations across Ohio and Michigan eliminated any 
possibility after 16:05:57 EDT that either operator 
action or automatic intervention could have lim¬ 
ited or mitigated the growing cascade. 

What might have happened on August 14 if these 
lines had not tripped on zone 2 and 3 relays? Each 
Figure 6.9. Map of Zone 3 (and Zone 2s Operating Like Zone 3s) Relay Operations on August 14, 2003

No.  Zone 3 Tripped Lines                      State     Trip Time (EDT)
1    Sammis - Star (345 kV)                    Ohio      16:05:57.504
2    Star (138/69 kV) Transformer #6;          Ohio      16:06:01
     Star - Dale (69 kV)
3    Ohio CTL - Wooster (138 kV)               Ohio      16:08:58
4    Galion - Ohio CTL - Muskingum (345 kV)    Ohio      16:08:59.158
     @ Muskingum
5    Richland - Wauseon - Midway (138 kV)      Ohio      16:09:00
6    Academia - Howard (138 kV);               Ohio      16:09:05
     Philo - Howard (138 kV)
7    Tangy - Kirby (138 kV);                   Ohio      16:09:06
     Tangy - Crissinger (138 kV)
8    E. Lima - Fostoria (345 kV)               Ohio      16:09:06.311
9    Argenta - Battle Ck (345 kV)              Michigan  16:10:36.230
10   Argenta - Tompkins (345 kV);              Michigan  16:10:36.310
     Battle Ck - Oneida
11   Argenta - Verona (138 kV)                 Michigan  16:10:37.550
12   Delhi - Island Road (138 kV)              Michigan  16:10:37.870
13   Verona - Batavia (138 kV)                 Michigan  16:10:37.900
14   Argenta - Morrow (138 kV)                 Michigan  16:10:38.350
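
The pacing of the zone 3 operations can be read directly from the trip times listed above. The short sketch below computes the elapsed time between a few of the listed trips; the helper name and the choice of which rows to compare are illustrative assumptions, while the timestamps are copied from the list.

```python
from datetime import datetime

def parse_trip_time(text):
    """Parse an EDT clock time, with or without fractional seconds."""
    fmt = "%H:%M:%S.%f" if "." in text else "%H:%M:%S"
    return datetime.strptime(text, fmt)

# Selected rows from the list of zone 3 trips (times in EDT).
trips = [
    ("Sammis - Star (345 kV)", "16:05:57.504"),
    ("E. Lima - Fostoria (345 kV)", "16:09:06.311"),
    ("Argenta - Battle Ck (345 kV)", "16:10:36.230"),
    ("Argenta - Morrow (138 kV)", "16:10:38.350"),
]

def elapsed_seconds(start_text, end_text):
    """Seconds between two clock times on the same day."""
    return (parse_trip_time(end_text) - parse_trip_time(start_text)).total_seconds()

if __name__ == "__main__":
    for (name_a, time_a), (name_b, time_b) in zip(trips, trips[1:]):
        print(f"{name_a} -> {name_b}: {elapsed_seconds(time_a, time_b):.3f} s")
```

The gaps shrink from about 189 seconds between the first two trips, to about 90 seconds, to roughly 2 seconds across the last Michigan trips, consistent with the acceleration described in the text.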


Voltage Collapse 

Although the blackout of August 14 has been 
labeled by some as a voltage collapse, it was not a 
voltage collapse as that term has been tradition¬ 
ally used by power system engineers. Voltage 
collapse occurs when an increase in load or loss 
of generation or transmission facilities causes 
dropping voltage, which causes a further reduc¬ 
tion in reactive power from capacitors and line 
charging, and still further voltage reductions. If 
the declines continue, these voltage reductions 
cause additional elements to trip, leading to fur¬ 
ther reduction in voltage and loss of load. The 
result is a progressive and uncontrollable decline 
in voltage, all because the power system is 
unable to provide the reactive power required to 
supply the reactive power demand. This did not 
occur on August 14. While the Cleveland-Akron 
area was short of reactive power reserves they 
were just sufficient to supply the reactive power 
demand in the area and maintain stable albeit 
depressed voltages for the outage conditions 
experienced. 

But the lines in the Cleveland-Akron area tripped
as a result of tree contacts well below the nominal
rating of the lines and not due to low voltages,
which are a precursor for voltage collapse.
The initial trips within FirstEnergy began 
because of ground faults with untrimmed 
trees, not because of a shortage of reactive power 
and low voltages. Voltage levels were within 


workable bounds before individual transmission 
trips began, and those trips occurred within nor¬ 
mal line ratings rather than in overloads. With 
fewer lines operational, current flowing over the 
remaining lines increased and voltage decreased 
(current increases in inverse proportion to the 
decrease in voltage for a given amount of power 
flow)—but it stabilized after each line trip until 
the next circuit trip. Soon northern Ohio lines 
began to trip out automatically on protection 
from overloads, not from insufficient reactive 
power. Once several lines tripped in the Cleve¬ 
land-Akron area, the power flow was rerouted to 
other heavily loaded lines in northern Ohio, 
causing depressed voltages which led to auto¬ 
matic tripping on protection from overloads. 
Voltage collapse therefore was not a cause of the 
cascade. 
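
The inverse relationship between current and voltage at a fixed power flow can be checked with the standard three-phase relation I = P / (sqrt(3) x V x pf). The 500 MW flow and the 90% voltage level below are hypothetical values chosen for illustration, not measurements from August 14.

```python
import math

def line_current_amps(power_mw, line_kv, power_factor=1.0):
    """Three-phase line current: I = P / (sqrt(3) * V * pf)."""
    return power_mw * 1e6 / (math.sqrt(3) * line_kv * 1e3 * power_factor)

# Same hypothetical power flow at nominal voltage and at 90% of nominal.
base = line_current_amps(power_mw=500.0, line_kv=345.0)
sagged = line_current_amps(power_mw=500.0, line_kv=0.90 * 345.0)

if __name__ == "__main__":
    print(f"current at nominal voltage: {base:.0f} A")
    print(f"current at 90% voltage:     {sagged:.0f} A")
    print(f"ratio: {sagged / base:.3f}")
```

A 10% drop in voltage raises the current by about 11% for the same power flow, so each line lost pushes the remaining lines closer to their limits on both counts.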

As the cascade progressed beyond Ohio, it spread
not because of insufficient reactive power and a
voltage collapse, but because of dynamic power
swings and the resulting system instability.
Figure 6.7 shows voltage levels recorded at the 
Niagara area. It shows clearly that voltage levels 
remained stable until 16:10:30 EDT, despite sig¬ 
nificant power fluctuations. In the cascade that 
followed, the voltage instability was a compan¬ 
ion to, not a driver of, the angle instability that 
tripped generators and lines. 
was operating with high load, and loads on each
line grew as each preceding line tripped out of ser¬ 
vice. But if these lines had not tripped quickly on 
zone 2s and 3s, each might have remained heavily 
loaded, with conductor temperatures increasing, 
for as long as 20 to 30 minutes before the line 
sagged into something and experienced a ground 
fault. For instance, the Dale-West Canton line took 
20 minutes to trip under 160 to 180% of its normal 
rated load. Even with sophisticated modeling it is 
impossible to predict just how long this delay 
might have occurred (affected by wind speeds, 
line loadings, and line length, tension and ground 
clearance along every span), because the system 
did not become dynamically unstable until at least 
after the Thetford-Jewell trip at 16:10:38 EDT. 
During this period the system would likely have 
remained stable and been able to readjust after 
each line trip on ground fault. If this period of 
deterioration and overloading under stable condi¬ 
tions had lasted for as little as 15 minutes or as 
long as an hour, it is possible that the growing 
problems could have been recognized and action 
taken, such as automatic under-voltage load¬ 
shedding, manual load-shedding in Ohio or other 
measures. So although the operation of zone 2 and 
3 relays in Ohio and Michigan did not cause the 
blackout, it is certain that 
they greatly expanded and 
accelerated the spread of 
the cascade. 


Recommendation 


21, page 158 


5D) Multiple Power Plants Tripped, Totaling 
946 MW: 16:09:08 to 16:10:27 EDT 

16:09:08 EDT: Michigan Cogeneration Venture 
plant reduction of 300 MW (from 1,263 MW to 
963 MW) 

16:09:17 EDT: Avon Lake 7 unit trips (82 MW) 

16:09:17 EDT: Burger 3, 4, and 5 units trip (355 
MW total) 

16:09:30 EDT: Kinder Morgan units 3, 6 and 7 
trip (209 MW total) 

The Burger units tripped after the 138-kV lines 
into the Burger 138-kV substation (Ohio) tripped 
from the low voltages in the Cleveland area (Fig¬ 
ure 6.10). The MCV plant is in central Michigan. 
Kinder Morgan is in south-central Michigan. The 
Kinder-Morgan units tripped due to a transformer 
fault and one due to over-excitation. 
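
As a quick arithmetic check, the four generation losses itemized above can be summed. The dictionary keys are shorthand labels for this example; the MW figures are those listed in the text.

```python
# Generation lost between 16:09:08 and 16:10:27 EDT, in MW, as itemized above.
generation_lost_mw = {
    "Michigan Cogeneration Venture reduction": 1263 - 963,
    "Avon Lake 7": 82,
    "Burger 3, 4, and 5": 355,
    "Kinder Morgan 3, 6, and 7": 209,
}

total_mw = sum(generation_lost_mw.values())

if __name__ == "__main__":
    for name, mw in generation_lost_mw.items():
        print(f"{name}: {mw} MW")
    print(f"Total: {total_mw} MW")
```

The itemized losses add up to 946 MW.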

Power flows into Michigan from Indiana 
increased to serve loads in eastern Michigan and 
northern Ohio (still connected to the grid through 
northwest Ohio and Michigan) and voltages 
dropped from the imbalance between high loads 


and limited transmission and generation 
capability. 

Phase 6: The Full Cascade 

Between 16:10:36 EDT and 16:13 EDT, thousands 
of events occurred on the grid, driven by physics 
and automatic equipment operations. When it was 
over, much of the northeastern United States and 
the province of Ontario were in the dark. 

Key Phase 6 Events 

Transmission Lines Disconnected Across 
Michigan and Northern Ohio, Generation Shut 
Down in Central Michigan and Northern Ohio, 
and Northern Ohio Separated from 
Pennsylvania: 16:10:36 to 16:10:39 EDT 

6A) Transmission and more generation tripped 
within Michigan: 16:10:36 to 16:10:37 EDT: 

16:10:36.2 EDT: Argenta-Battle Creek 345-kV 
line tripped 

16:10:36.3 EDT: Argenta-Tompkins 345-kV 
line tripped 

16:10:36.8 EDT: Battle Creek-Oneida 345-kV 
line tripped 

16:10:37 EDT: Sumpter Units 1, 2, 3, and 4 
tripped on under-voltage (300 MW near 
Detroit) 

16:10:37.5 EDT: MCV Plant output dropped 
from 963 MW to 109 MW on over-current 
protection. 

Together, the above line outages interrupted the 
west-to-east transmission paths into the Detroit 
area from south-central Michigan. The Sumpter 
generation units tripped in response to 


Figure 6.10. Michigan and Ohio Power Plants Trip 
under-voltage on the system. Michigan lines west 
of Detroit then began to trip, as shown in Figure 
6.11. 

The Argenta-Battle Creek relay first opened the 
line at 16:10:36.230 EDT, reclosed it at 16:10:37, 
then tripped again. This line connects major gen¬ 
erators—including the Cook and Palisades 
nuclear plants and the Campbell fossil plant—to 
the MECS system. This line is designed with 
auto-reclose breakers at each end of the line, 
which do an automatic high-speed reclose as soon 
as they open to restore the line to service with no 
interruptions. Since the majority of faults on the 
North American grid are temporary, automatic 
reclosing can enhance stability and system reli¬ 
ability. However, situations can occur when the 
power systems behind the two ends of the line 
could go out of phase during the high-speed 
reclose period (typically less than 30 cycles, or one 
half second, to allow the air to de-ionize after the 
trip to prevent arc re-ignition). To address this and 
protect generators from the harm that an 
out-of-synchronism reconnect could cause, it is 
worth studying whether a synchro-check relay is 
needed, to reclose the second breaker only when 
the two ends are within a certain voltage and 
phase angle tolerance. No such protection was 
installed at Argenta-Battle Creek; when the line 
reclosed, there was a 70° difference in phase 
across the circuit breaker reclosing the line. There 


Figure 6.11. Transmission and Generation Trips in 
Michigan, 16:10:36 to 16:10:37 EDT 



is no evidence that the reclose caused harm to the 
local generators. 
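
The synchro-check logic described above can be sketched as a simple permissive test. This is an illustration only: the function name `synchro_check_permits_reclose`, the per-unit voltages, and the tolerance values (10% voltage difference, 20 degrees of angle) are hypothetical placeholders, not settings taken from the report or from any actual relay. Only the 70° angle difference comes from the text.

```python
def synchro_check_permits_reclose(v_a_pu, v_b_pu, angle_diff_deg,
                                  max_v_diff_pu=0.10, max_angle_deg=20.0):
    """Return True only if both ends are close enough in voltage and angle.

    v_a_pu, v_b_pu   -- voltage magnitudes on each side of the open breaker (per unit)
    angle_diff_deg   -- phase angle difference across the breaker (degrees)
    The default tolerances are assumed values for illustration.
    """
    # Wrap the angle into [-180, 180) so that 350 degrees is treated as -10.
    wrapped = (angle_diff_deg + 180.0) % 360.0 - 180.0
    voltage_ok = abs(v_a_pu - v_b_pu) <= max_v_diff_pu
    angle_ok = abs(wrapped) <= max_angle_deg
    return voltage_ok and angle_ok


# A 70-degree difference (the value reported for Argenta-Battle Creek) is blocked
# under these assumed tolerances; a small difference is allowed.
print(synchro_check_permits_reclose(1.0, 0.98, 70.0))
print(synchro_check_permits_reclose(1.0, 0.98, 8.0))
```

With a check of this kind in place, the second breaker would stay open until the two systems drift back within tolerance, rather than reclosing unconditionally at high speed.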

6B) Western and Eastern Michigan separation 
started: 16:10:37 EDT to 16:10:38 EDT 

16:10:38.2 EDT: Hampton-Pontiac 345-kV 
line tripped 

16:10:38.4 EDT: Thetford-Jewell 345-kV line 
tripped 

After the Argenta lines tripped, the phase angle 
between eastern and western Michigan began to 
increase. The Hampton-Pontiac and Thetford- 
Jewell 345-kV lines were the only lines remaining 
connecting Detroit to power sources and the rest of 
the grid to the north and west. When these lines 
tripped out of service, it left the loads in Detroit, 
Toledo, Cleveland, and their surrounding areas 
served only by local generation and the lines north 
of Lake Erie connecting Detroit east to Ontario and 
the lines south of Lake Erie from Cleveland east to 
northwest Pennsylvania. These trips completed 
the extra-high voltage network separation 
between eastern and western Michigan. 

The Power System Disturbance Recorders at Keith 
and Lambton, Ontario, captured these events in 
the flows across the Ontario-Michigan interface, 
as shown in Figure 6.12 and Figure 6.16. These show 
clearly that the west-to-east Michigan separation 
(the Thetford-Jewell trip) was the start and Erie 
West-Ashtabula-Perry was the trigger for the 3,700 
MW surge from Ontario into Michigan. When 
Thetford-Jewell tripped, power that had been 
flowing into Michigan and Ohio from western 
Michigan, western Ohio and Indiana was cut off. 
The nearby Ontario recorders saw a pronounced 
impact as flows into Detroit readjusted to draw 
power from the northeast instead. To the south, 

Figure 6.12. Flows on Keith-Waterman 230-kV 
Erie West-Ashtabula-Perry was the last 345-kV 
eastern link for northern Ohio loads. When that 
line severed, all the power that moments before 
had flowed across Michigan and Ohio paths was 
now diverted in a counter-clockwise direction 
around Lake Erie through the single path left in 
eastern Michigan, pulling power out of Ontario, 
New York and PJM. 

Figures 6.13 and 6.14 show the results of investigation 
team modeling of the line loadings on the 
Ohio, Michigan, and other regional interfaces for 
the period from 16:05:57 EDT until the Thetford-Jewell 
trip, to understand how power flows shifted 
during this period. The team simulated evolving 
system conditions on August 14, 2003, based on 
the 16:05:50 power flow case developed by the 
MAAC-ECAR-NPCC Operations Studies Working 
Group. Each horizontal line in the graph indicates 
a single 345-kV line or a set of 345-kV lines and its 
loading relative to normal ratings over time as first one, 
then another, set of circuits tripped out of service. 
In general, each subsequent line trip causes the 
remaining line loadings to rise; where a line drops 
(as Erie West-Ashtabula-Perry in Figure 6.13 after 
the Hanna-Juniper trip), that indicates that line 
loading lightened, most likely due to customers 
dropped from service. Note that Muskingum and 
East Lima-Fostoria Central were overloaded before 
they tripped, but the Michigan west and north 
interfaces were not overloaded before they trip¬ 
ped. Erie West-Ashtabula-Perry was loaded to 
130% after the Hampton-Pontiac and Thetford- 
Jewell trips. 

The Regional Interface Loadings graph (Figure 
6.14) shows that loadings at the interfaces 
between PJM-NY, NY-Ontario and NY-New Eng¬ 
land were well within normal ratings before the 
east-west Michigan separation. 


6C) Cleveland separated from Pennsylvania, 
flows reversed and a huge power surge 
flowed counter-clockwise around Lake Erie: 
16:10:38.6 EDT 

16:10:38.6 EDT: Erie West-Ashtabula-Perry 
345-kV line tripped at Perry 
16:10:38.6 EDT: Large power surge to serve 
loads in eastern Michigan and northern Ohio 
swept across Pennsylvania, New Jersey, and 
New York through Ontario into Michigan. 

Perry-Ashtabula was the last 345-kV line connect¬ 
ing northern Ohio to the east south of Lake Erie. 
This line’s trip at the Perry substation on a zone 3 
relay operation separated the northern Ohio 
345-kV transmission system from Pennsylvania 
and all eastern 345-kV connections. After this trip, 
the load centers in eastern Michigan and northern 
Ohio (Detroit, Cleveland, and Akron) remained 
connected to the rest of the Eastern Interconnec¬ 
tion only to the north at the interface between the 
Michigan and Ontario systems (Figure 6.15). East¬ 
ern Michigan and northern Ohio now had little 
internal generation left and voltage was declining. 
The frequency in the Cleveland area dropped rap¬ 
idly, and between 16:10:39 and 16:10:50 EDT 
under-frequency load shedding in the Cleveland 
area interrupted about 1,750 MW of load. How¬ 
ever, the load shedding did not drop enough load 
relative to local generation to rebalance and arrest 
the frequency decline. Since the electrical system 
always seeks to balance load and generation, the 
high loads in Detroit and Cleveland drew power 
over the only major transmission path remain¬ 
ing—the lines from eastern Michigan into Ontario. 
Mismatches between generation and load are 
reflected in changes in frequency, so with more 
generation than load, frequency rises, and with less 
generation than load, frequency falls. 
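
The link between a generation-load mismatch and the direction of frequency change can be illustrated with the standard aggregated swing-equation approximation, df/dt = f0 (Pgen - Pload) / (2 H S). The sketch below is not the investigation team's model; the inertia constant, island rating, and megawatt values are hypothetical numbers chosen only to show the sign and rough scale of the effect.

```python
def initial_rocof_hz_per_s(gen_mw, load_mw, inertia_h_s, base_mva, f0_hz=60.0):
    """Initial rate of change of frequency for an island (Hz per second).

    Uses the aggregated swing equation df/dt = f0 * (Pgen - Pload) / (2 * H * S).
    inertia_h_s (H, in seconds) and base_mva (S) are assumed island-wide values.
    """
    return f0_hz * (gen_mw - load_mw) / (2.0 * inertia_h_s * base_mva)


# Hypothetical island: 8,000 MW of generation serving 10,000 MW of load.
deficit = initial_rocof_hz_per_s(gen_mw=8000, load_mw=10000,
                                 inertia_h_s=4.0, base_mva=10000)
# Hypothetical island with the imbalance reversed.
surplus = initial_rocof_hz_per_s(gen_mw=10000, load_mw=8000,
                                 inertia_h_s=4.0, base_mva=10000)
print(f"deficit island: {deficit:+.2f} Hz/s, surplus island: {surplus:+.2f} Hz/s")
```

A negative result corresponds to the falling frequency seen in the generation-short Cleveland area; a positive result corresponds to an area left with excess generation.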


Figure 6.13. Simulated 345-kV Line Loadings from 



Figure 6.14. Simulated Regional Interface Loadings 
from 16:05:57 through 16:10:38.4 EDT 
At 16:10:38.6 EDT, after the above transmission 
paths into Michigan and Ohio failed, the power 
that had been flowing at modest levels into Michi¬ 
gan from Ontario suddenly jumped in magnitude. 
While flows from Ontario into Michigan had been 
in the 250 to 350 MW range since 16:10:09.06 
EDT, with this new surge they peaked at 3,700 
MW at 16:10:39 EDT (Figure 6.16). Electricity 
moved along a giant loop through Pennsylvania 
and into New York and Ontario and then into 
Michigan via the remaining transmission path to 
serve the combined loads of Cleveland, Toledo, 
and Detroit. This sudden large change in power 
flows drastically lowered voltage and increased 
current levels on the transmission lines along the 
Pennsylvania-New York transmission interface. 

This was a power surge of large magnitude, so fre¬ 
quency was not the same across the Eastern Inter¬ 
connection. As Figure 6.16 shows, the power 
swing resulted in a rapid rate of voltage decay. 
Flows into Detroit exceeded 3,700 MW and 1,500 
MVAr—the power surge was draining real power 
out of the northeast, causing voltages in Ontario 
and New York to drop. At the same time, local 
voltages in the Detroit area were plummeting 
because Detroit had already lost 500 MW of local 
generation. Detroit would soon lose synchronism 


and black out (as evidenced by the rapid power 
oscillations decaying after 16:10:43 EDT). 


Figure 6.15. Michigan Lines Trip and Ohio 
Separates from Pennsylvania, 16:10:36 to 
16:10:38.6 EDT 



Modeling the Cascade 

Computer modeling of the cascade built upon the 
modeling conducted of the pre-cascade system 
conditions described in Chapter 5. That earlier 
modeling developed steady-state load flow and 
voltage analyses for the entire Eastern Intercon¬ 
nection from 15:00 to 16:05:50 EDT. The 
dynamic modeling used the steady state load 
flow model for 16:05:50 as the starting point to 
simulate the cascade. Dynamic modeling con¬ 
ducts a series of load flow analyses, moving from 
one set of system conditions to another in steps 
one-quarter of a cycle long—in other words, to 
move one second from 16:10:00 to 16:10:01 
requires simulation of 240 separate time slices. 
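
The step count quoted above follows directly from the 60 Hz system frequency and the quarter-cycle step. A short check, with illustrative names that do not come from the modeling team's software:

```python
NOMINAL_FREQUENCY_HZ = 60   # North American grid frequency
STEPS_PER_CYCLE = 4         # one step per quarter cycle, as stated in the text

# Duration of one simulation step, in seconds (about 4.17 milliseconds).
step_seconds = 1.0 / (NOMINAL_FREQUENCY_HZ * STEPS_PER_CYCLE)


def slices_for(duration_seconds):
    """Number of quarter-cycle time slices needed to cover a duration."""
    return round(duration_seconds * NOMINAL_FREQUENCY_HZ * STEPS_PER_CYCLE)


print(slices_for(1.0))                       # slices in one second of simulated time
print(f"{step_seconds * 1000:.2f} ms per step")
```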

The model used a set of equations that incorpo¬ 
rate the physics of an electrical system. It 
contained detailed sub-models to reflect the 
characteristics of loads, under-frequency load¬ 
shedding, protective relay operations, generator 
operations (including excitation systems and 
governors), static VAr compensators and other 
FACTS devices, and transformer tap changers. 

The modelers compared model results at each 
moment to actual system data for that moment to 


verify a close correspondence for line flows and 
voltages. If there was too much of a gap between 
modeled and actual results, they looked at the 
timing of key events to see whether actual data 
might have been mis-recorded, or whether the 
modeled variance for an event not previously 
recognized as significant might influence the 
outcome. Through 16:10:40 EDT, the team 
achieved very close benchmarking of the model 
against actual results. 

The modeling team consisted of industry mem¬ 
bers from across the Midwest, Mid-Atlantic and 
NPCC areas. All have extensive electrical engi¬ 
neering and/or mathematical training and experi¬ 
ence as system planners for short- or long-term 
operations. 

This modeling allows the team to verify its 
hypotheses as to why particular events occurred 
and the relationships between different events 
over time. It allows testing of many “what if” scenarios 
and alternatives, to determine whether a 
change in system conditions might have produced 
a different outcome. 
Just before the Argenta-Battle Creek trip, when 
Michigan separated west to east at 16:10:37 EDT, 
almost all of the generators in the eastern intercon¬ 
nection were moving in synchronism with the 
overall grid frequency of 60 Hertz (shown at the 
bottom of Figure 6.17), but when the swing 
started, those machines absorbed some of its ener¬ 
gy as they attempted to adjust and resynchronize 
with the rapidly changing frequency. In many 

Figure 6.16. Active and Reactive Power and Voltage 



cases, this adjustment was unsuccessful and the 
generators tripped out from milliseconds to sev¬ 
eral seconds thereafter. 

The Perry-Ashtabula-Erie West 345-kV line trip at 
16:10:38.6 EDT was the point when the Northeast 
entered a period of transient instability and a loss 
of generator synchronism. Between 16:10:38 and 
16:10:41 EDT, the power swings caused a sudden 
extraordinary increase in system frequency, hit¬ 
ting 60.7 Hz at Lambton and 60.4 Hz at Niagara. 

Because the demand for power in Michigan, Ohio, 
and Ontario was drawing on lines through New 
York and Pennsylvania, heavy power flows were 
moving northward from New Jersey over the New 
York tie lines to meet those power demands, exac¬ 
erbating the power swing. Figure 6.17 shows 
actual net line flows summed across the interfaces 
between the main regions affected by these 
swings—Ontario into Michigan, New York into 
Ontario, New York into New England, and PJM 
into New York. This shows clearly that the power 
swings did not move in unison across every inter¬ 
face at every moment, but varied in magnitude 
and direction. This occurred for two reasons. First, 
the availability of lines to complete the path across 


Figure 6.17. Measured Power Flows and Frequency Across Regional Interfaces, 16:10:30 to 16:11:00 EDT, 
with Key Events in the Cascade 


each interface varied over time, as did the amount 
of load that drew upon each interface, so net flows 
across each interface were not facing consistent 
demand with consistent capability as the cascade 
progressed. Second, the speed and magnitude of 
the swing was moderated by the inertia, reactive 
power capabilities, loading conditions and loca¬ 
tions of the generators across the entire region. 

After Cleveland was cut off from Pennsylvania 
and eastern power sources, Figure 6.17 shows the 
start of the dynamic power swing at 16:10:38.6. 
Because the loads of Cleveland, Toledo and 
Detroit (less the load already blacked out) were 
now hanging off Michigan and Ontario, this forced 
a gigantic shift in power flows to meet that 
demand. As noted above, flows from Ontario into 
Michigan increased from 1,000 MW to 3,700 MW 
shortly after the start of the swing, while flows 
from PJM into New York were close behind. But 
within two seconds from the start of the swing, at 
16:10:40 EDT flows reversed and coursed back 
from Michigan into Ontario at the same time that 
frequency at the interface dropped, indicating that 
significant generation had been lost. Flows that 
had been westbound across the Ontario-Michigan 
interface by over 3,700 MW at 16:10:38.8 dropped 
down to 2,100 MW eastbound by 16:10:40, and 
then returned westbound starting at 16:10:40.5. 

A series of circuits tripped along the border 
between PJM and the NYISO due to zone 1 imped¬ 
ance relay operations on overload and depressed 
voltage. The surge also moved into New England 
and the Maritimes region of Canada. The combi¬ 
nation of the power surge and frequency rise 
caused 380 MW of pre-selected Maritimes genera¬ 
tion to drop off-line due to the operation of the 
New Brunswick Power “Loss of Line 3001” Special 
Protection System. Although this system was 
designed to respond to failure of the 345-kV link 
between the Maritimes and New England, it oper¬ 
ated in response to the effects of the power surge. 
The link remained intact during the event. 

6D) Conditions in Northern Ohio and Eastern 
Michigan Degraded Further, With More 
Transmission Lines and Power Plants Fail¬ 
ing: 16:10:39 to 16:10:46 EDT 

Line trips in Ohio and eastern Michigan: 

16:10:39.5 EDT: Bay Shore-Monroe 345-kV 
line 

16:10:39.6 EDT: Allen Junction-Majestic- 
Monroe 345-kV line 

16:10:40.0 EDT: Majestic-Lemoyne 345-kV 
line 


Majestic 345-kV Substation: one terminal 
opened sequentially on all 345-kV lines 

16:10:41.8 EDT: Fostoria Central-Galion 
345-kV line 

16:10:41.911 EDT: Beaver-Davis Besse 
345-kV line 

Under-frequency load-shedding in Ohio: 

FirstEnergy shed 1,754 MVA load 

AEP shed 133 MVA load 

Seven power plants, for a total of 3,294 MW of 
generation, tripped off-line in Ohio: 

16:10:42 EDT: Bay Shore Units 1-4 (551 MW 
near Toledo) tripped on over-excitation 

16:10:40 EDT: Lakeshore unit 18 (156 MW, 
near Cleveland) tripped on under-frequency 

16:10:41.7 EDT: Eastlake 1, 2, and 3 units 
(304 MW total, near Cleveland) tripped on 
under-frequency 

16:10:41.7 EDT: Avon Lake unit 9 (580 MW, 
near Cleveland) tripped on under-frequency 

16:10:41.7 EDT: Perry 1 nuclear unit (1,223 
MW, near Cleveland) tripped on under¬ 
frequency 

16:10:42 EDT: Ashtabula unit 5 (184 MW, 
near Cleveland) tripped on under-frequency 

16:10:43 EDT: West Lorain units (296 MW) 
tripped on under-voltage 

Four power plants producing 1,759 MW tripped 
off-line near Detroit: 

16:10:42 EDT: Greenwood unit 1 tripped (253 
MW) on low voltage, high current 

16:10:41 EDT: Belle River unit 1 tripped (637 
MW) on out-of-step 

16:10:41 EDT: St. Clair unit 7 tripped (221 
MW, DTE unit) on high voltage 

16:10:42 EDT: Trenton Channel units 7A, 8 
and 9 tripped (648 MW) 
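
The plant totals quoted in the lists above can be cross-checked by summing the individual entries. The sketch below simply re-keys the megawatt figures from the lists; the dictionary names are illustrative.

```python
# MW lost per plant, copied from the Ohio list above.
ohio_trips_mw = {
    "Bay Shore 1-4": 551,
    "Lakeshore 18": 156,
    "Eastlake 1-3": 304,
    "Avon Lake 9": 580,
    "Perry 1": 1223,
    "Ashtabula 5": 184,
    "West Lorain": 296,
}

# MW lost per plant, copied from the Detroit-area list above.
detroit_trips_mw = {
    "Greenwood 1": 253,
    "Belle River 1": 637,
    "St. Clair 7": 221,
    "Trenton Channel 7A, 8, 9": 648,
}

ohio_total = sum(ohio_trips_mw.values())
detroit_total = sum(detroit_trips_mw.values())
print(len(ohio_trips_mw), "Ohio plants:", ohio_total, "MW")
print(len(detroit_trips_mw), "Detroit-area plants:", detroit_total, "MW")
```

Both sums agree with the totals stated in the text (3,294 MW and 1,759 MW).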

Back in northern Ohio, the trips of the Bay 
Shore-Monroe, Majestic-Lemoyne, and Allen 
Junction-Majestic-Monroe 345-kV lines, and the 
Ashtabula 345/138-kV transformer cut off Toledo 
and Cleveland from the north, turning that area 
into an electrical island (Figure 6.18). Frequency 
in this large island began to fall rapidly. This 
caused a series of power plants in the area to trip 
off-line due to the operation of under-frequency 
relays, including the Bay Shore units. When the 
Beaver-Davis Besse 345-kV line between Cleve¬ 
land and Toledo tripped, it left the Cleveland area 
completely isolated and area frequency rapidly 
declined. Cleveland area load was disconnected 
by automatic under-frequency load-shedding 
(approximately 1,300 MW), and another 434 MW 
of load was interrupted after the generation 
remaining within this transmission “island” was 
tripped by under-frequency relays. This sudden 
load drop would contribute to the reverse power 
swing. In its own island, portions of Toledo 
blacked out from automatic under-frequency 
load-shedding but most of the Toledo load was 
restored by automatic reclosing of lines such as 
the East Lima-Fostoria Central 345-kV line and 
several lines at the Majestic 345-kV substation. 

The Perry nuclear plant is in Ohio on Lake Erie, 
not far from the Pennsylvania border. The Perry 
plant was inside a decaying electrical island, 
and the plant tripped on under-frequency, as 
designed. A number of other units near Cleveland 
were tripped off-line by under-frequency protection. 

The tremendous power flow into Michigan, begin¬ 
ning at 16:10:38, occurred when Toledo and 
Cleveland were still connected to the grid only 
through Detroit. After the Bay Shore-Monroe line 
tripped at 16:10:39, Toledo-Cleveland were sepa¬ 
rated into their own island, dropping a large 
amount of load off the Detroit system. This left 
Detroit suddenly with excess generation, much of 
which was greatly accelerated in angle as the 
depressed voltage in Detroit (caused by the high 
demand in Cleveland) caused the Detroit units to 
pull nearly out of step. With the Detroit generators 


Figure 6.18. Cleveland and Toledo Islanded, 
16:10:39 to 16:10:46 EDT 



running at maximum mechanical output, they 
began to pull out of synchronous operation with 
the rest of the grid. When voltage in Detroit 
returned to near-normal, the generators could not 
fully pull back their rate of revolutions, and ended 
up producing excessive temporary output levels, 
still out of step with the system. This is evident in 
Figure 6.19, which shows at least two sets of gen¬ 
erator “pole slips” by plants in the Detroit area 
between 16:10:40 EDT and 16:10:42 EDT. Several 
large units around Detroit—Belle River, St. Clair, 
Greenwood, Monroe, and Fermi—all tripped in 
response. After formation of the Cleveland-Toledo 
island at 16:10:40 EDT, Detroit frequency spiked 
to almost 61.7 Hz before dropping, momentarily 
equalized between the Detroit and Ontario sys¬ 
tems, but Detroit frequency began to decay at 2 
Hz/sec and the generators then experienced 
under-speed conditions. 

Re-examination of Figure 6.17 shows the power 
swing from the northeast through Ontario into 
Michigan and northern Ohio that began at 
16:10:37, and how it reverses and swings back 
around Lake Erie at 16:10:39 EDT. That return was 
caused by the combination of natural oscillations, 
accelerated by major load losses, as the northern 
Ohio system disconnected from Michigan. It 
caused a power flow change of 5,800 MW, from 
3,700 MW westbound to 2,100 MW eastbound across 
the Ontario to Michigan border between 
16:10:39.5 and 16:10:40 EDT. Since the system 
was now fully dynamic, this large eastbound oscillation 
would lead naturally to a rebound, which 
began at 16:10:40 EDT with an inflection point 
reflecting generation shifts between Michigan and 
Ontario and additional line losses in Ohio. 
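
The 5,800 MW figure is the difference between two flows in opposite directions, which is easiest to see with a signed convention. In the sketch below, westbound flow (Ontario into Michigan) is taken as positive; that sign convention and the variable names are choices made for this illustration. The average rate is derived from the times quoted in the text.

```python
# Signed flows across the Ontario-Michigan border (MW); westbound is positive.
flow_before_mw = 3700.0    # westbound, at 16:10:39.5 EDT
flow_after_mw = -2100.0    # eastbound, at 16:10:40 EDT
elapsed_seconds = 0.5      # 16:10:39.5 to 16:10:40

swing_mw = flow_after_mw - flow_before_mw          # negative: shift toward eastbound
average_rate_mw_per_s = abs(swing_mw) / elapsed_seconds

print(f"net change: {abs(swing_mw):.0f} MW")
print(f"average rate: {average_rate_mw_per_s:.0f} MW per second")
```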


Figure 6.19. Generators Under Stress in Detroit, 
as Seen from Keith PSDR 
Western Pennsylvania Separated from New 
York: 16:10:39 EDT to 16:10:44 EDT 

6E) 16:10:39 EDT: Homer City-Watercure Road 
345 kV 

16:10:39 EDT: Homer City-Stolle Road 345 kV 

6F) 16:10:44 EDT: South Ripley-Erie East 230 kV, 
and South Ripley-Dunkirk 230 kV 

16:10:44 EDT: East Towanda-Hillside 230 kV 

Responding to the swing of power out of Michigan 
toward Ontario and into New York and PJM, zone 
1 relays on the 345-kV lines separated Pennsylva¬ 
nia from New York (Figure 6.20). Homer 
City-Watercure (177 miles or 285 km) and Homer 
City-Stolle Road (207 miles or 333 km) are very 
long lines and so have high impedance. Zone 1 
relays do not have timers, and operate instantly 
when a power swing enters the relay target circle. 
For normal length lines, zone 1 relays have small 
target circles because the relay is measuring less 
than the full length of the line, but for a long line 
the large line impedance enlarges the relay’s target 
circle and makes it more likely to be hit by the 
power swing. The Homer City-Watercure and 
Homer City-Stolle Road lines do not have zone 3 
relays. 

Given the length and impedance of these lines, it 
was highly likely that they would trip and separate 
early in the face of such large power swings. Most 
of the other interfaces between regions are on 
short ties—for instance, the ties between New 
York and Ontario and Ontario to Michigan are 
only about 2 miles (3.2 km) long, so they are elec¬ 
trically very short and thus have much lower 
impedance and trip less easily than these long 
lines. A zone 1 relay target for a short line covers a 


Figure 6.20. Western Pennsylvania Separates from 
New York, 16:10:39 EDT to 16:10:44 EDT 



small area so a power swing is less likely to enter 
the relay target circle at all, averting a zone 1 trip. 
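
The effect of line length on the zone 1 target circle can be sketched with a simple mho (circular) characteristic whose diameter runs from the origin to the relay reach. Everything numeric below except the two line lengths is an assumption made for illustration: the 0.6 ohm-per-mile impedance, the 85° line angle, the 85% reach setting, and the apparent impedance of the swing are hypothetical, and actual relays on these lines may use different characteristics and settings.

```python
import cmath
import math


def zone1_reach(line_miles, ohms_per_mile=0.6, line_angle_deg=85.0,
                reach_fraction=0.85):
    """Zone 1 reach as a complex impedance (ohms).

    The relay covers only part of the line (reach_fraction < 1), so the reach
    scales with line length. All default values are assumed for illustration.
    """
    magnitude = reach_fraction * ohms_per_mile * line_miles
    return cmath.rect(magnitude, math.radians(line_angle_deg))


def inside_mho_circle(z_apparent, z_reach):
    """True if the apparent impedance lies inside the mho circle.

    The circle passes through the origin and has the reach as its diameter,
    so its center is z_reach / 2 and its radius is |z_reach| / 2.
    """
    center = z_reach / 2
    return abs(z_apparent - center) <= abs(z_reach) / 2


# Hypothetical apparent impedance seen during a power swing: 40 ohms at 30 degrees.
swing_point = cmath.rect(40.0, math.radians(30.0))

for name, miles in [("Homer City-Watercure (177 miles)", 177),
                    ("short tie (2 miles)", 2)]:
    tripped = inside_mho_circle(swing_point, zone1_reach(miles))
    print(f"{name}: swing inside zone 1 circle = {tripped}")
```

Under these assumptions the same swing impedance falls inside the large circle of the long line but far outside the small circle of the short tie, which is the behavior the text describes.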

At 16:10:44 EDT, the northern part of the Eastern 
Interconnection (including eastern Michigan) was 
connected to the rest of the Interconnection at 
only two locations: (1) in the east through the 
500-kV and 230-kV ties between New York and 
northeast New Jersey, and (2) in the west through 
the long and electrically fragile 230-kV transmis¬ 
sion path connecting Ontario to Manitoba and 
Minnesota. The separation of New York from 
Pennsylvania (leaving only the lines from New Jer¬ 
sey into New York connecting PJM to the north¬ 
east) buffered PJM in part from these swings. 
Frequency was high in Ontario at that point, indi¬ 
cating that there was more generation than load, 
so much of this flow reversal never got past 
Ontario into New York. 

6G) Transmission paths disconnected in New 
Jersey and northern Ontario, isolating the 
northeast portion of the Eastern 
Interconnection: 16:10:43 to 16:10:45 EDT 

16:10:43 EDT: Keith-Waterman 230-kV line 
tripped 

16:10:45 EDT: Wawa-Marathon 230-kV lines 
tripped 

16:10:45 EDT: Branchburg-Ramapo 500-kV line 
tripped 

At 16:10:43 EDT, eastern Michigan was still con¬ 
nected to Ontario, but the Keith-Waterman 
230-kV line that forms part of that interface dis¬ 
connected due to apparent impedance (Figure 
6.21). This put more power onto the remaining 
interface between Ontario and Michigan, but 


Figure 6.21. Northeast Separates from Eastern 
Interconnection, 16:10:45 EDT 




triggered sustained oscillations in both power 
flow and frequency along the remaining 230-kV 
line. 

At 16:10:45 EDT, northwest Ontario separated 
from the rest of Ontario when the Wawa-Marathon 
230-kV lines (104 miles or 168 km long) discon¬ 
nected along the northern shore of Lake Superior, 
tripped by zone 1 distance relays at both ends. 
This separation left the loads in the far northwest 
portion of Ontario connected to the Manitoba and 
Minnesota systems, and protected them from the 
blackout. 

The 69-mile (111 km) long Branchburg-Ramapo 
500-kV line and Ramapo transformer between 
New Jersey and New York formed the last major 
transmission path remaining between the Eastern 
Interconnection and the area ultimately affected by the 
blackout. Figure 6.22 shows how that line disconnected 
at 16:10:45 EDT, along with other underlying 
230-kV and 138-kV lines in northeast New Jersey. 
Branchburg-Ramapo was carrying over 3,000 
MVA and 4,500 amps with voltage at 79% before it 
tripped, either on a high-speed swing into zone 1 
or on a direct transfer trip. The investigation team 
is still examining why the higher impedance 
230-kV overhead lines tripped while the under¬ 
ground Hudson-Farragut 230-kV cables did not; 
the available data suggest that the notably lower 
impedance of underground cables made these less 
vulnerable to the electrical strain placed on the 
system. 
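
The Branchburg-Ramapo figures quoted above are mutually consistent, and they also show how low the apparent impedance seen by the relay had become. The sketch below assumes balanced three-phase conditions and reads "voltage at 79%" as 79% of the nominal 500-kV line-to-line voltage; both are assumptions made for this illustration rather than statements from the report.

```python
import math


def three_phase_mva(v_line_to_line_kv, current_amps):
    """Apparent power S = sqrt(3) * V_LL * I, returned in MVA."""
    return math.sqrt(3) * v_line_to_line_kv * current_amps / 1000.0


def apparent_impedance_ohms(v_line_to_line_kv, current_amps):
    """Per-phase apparent impedance: phase voltage divided by line current."""
    phase_volts = v_line_to_line_kv * 1000.0 / math.sqrt(3)
    return phase_volts / current_amps


voltage_kv = 0.79 * 500.0   # 79% of nominal 500 kV (assumed interpretation)
current_a = 4500.0          # amps, from the text

s_mva = three_phase_mva(voltage_kv, current_a)
z_ohms = apparent_impedance_ohms(voltage_kv, current_a)
print(f"apparent power: {s_mva:.0f} MVA")       # consistent with "over 3,000 MVA"
print(f"apparent impedance: {z_ohms:.1f} ohms")
```

Depressed voltage and high current both push the apparent impedance down, which is why a heavy power swing can look like a fault to a distance relay.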

This left the northeast portion of New Jersey con¬ 
nected to New York, while Pennsylvania and the 
rest of New Jersey remained connected to the rest 
of the Eastern Interconnection. Within northeast 


Figure 6.22. PJM to New York Interties Disconnect 




Lines shown: Erie South-South Ripley; Homer City-Stolle Rd; Homer City-Watercure; 
Branchburg-Ramapo; Waldwick-Ramapo (J); Waldwick-Ramapo (K); 
Hudson-Farragut (B); Hudson-Farragut (C) 



Note: The data in this figure come from the NYISO Energy 
Management System SDAC high speed analog system, which 
records 10 samples per second. 


New Jersey, the separation occurred along the 
230-kV corridors which are the main supply feeds 
into the northern New Jersey area (the two 
Roseland-Athenia circuits and the 
Linden-Bayway circuit). These circuits supply the 
large customer load in northern New Jersey and 
are a primary route for power transfers into New 
York City, so they are usually more highly loaded 
than other interfaces. These lines tripped west and 
south of the large customer loads in northeast New 
Jersey. 

The separation of New York, Ontario, and New 
England from the rest of the Eastern Interconnec¬ 
tion occurred due to natural breaks in the system 
and automatic relay operations, which performed 
exactly as they were designed to. No human inter¬ 
vention occurred by operators at PJM headquar¬ 
ters or elsewhere to effect this split. At this point, 
the Eastern Interconnection was divided into two 
major sections. To the north and east of the separa¬ 
tion point lay New York City, northern New Jer¬ 
sey, New York state, New England, the Canadian 
Maritime Provinces, eastern Michigan, the major¬ 
ity of Ontario, and the Quebec system. 

The rest of the Eastern Interconnection, to the 
south and west of the separation boundary, was 
not seriously affected by the blackout. Frequency 
in the Eastern Interconnection was 60.3 Hz at the 
time of separation; this means that approximately 
3,700 MW of excess generation that was on-line to 
export into the northeast was now in the main 
Eastern Island, separated from the load it had been 
serving. This left the northeast island with even 
less in-island generation on-line as it attempted to 
rebalance in the next phase of the cascade. 

Phase 7: 

Several Electrical Islands Formed 
in Northeast U.S. and Canada: 
16:10:46 EDT to 16:12 EDT 

Overview of This Phase 

During the next 3 seconds, the islanded northern 
section of the Eastern Interconnection broke apart 
internally. Figure 6.23 illustrates the events of this 
phase. 

7A) New York-New England upstate transmis¬ 
sion lines disconnected: 16:10:46 to 16:10:47 
EDT 

7B) New York transmission system split along 
Total East interface: 16:10:49 EDT 
7C) The Ontario system just west of Niagara Falls 
and west of St. Lawrence separated from the 
western New York island: 16:10:50 EDT 

7D) Southwest Connecticut separated from New 
York City: 16:11:22 EDT 

7E) Remaining transmission lines between 
Ontario and eastern Michigan separated: 
16:11:57 EDT 

By this point most portions of the affected area 
were blacked out. 

If the 6th phase of the cascade was about dynamic 
system oscillations, the last phase is a story of the 
search for balance between loads and generation. 
Here it is necessary to understand three matters 
related to system protection—why the blackout 
stopped where it did, how and why under-voltage 
and under-frequency load-shedding work, and 
what happened to the generators on August 14 
and why. These matter because loads and generation 
must ultimately balance in real time for the system to 
remain stable. When the grid is breaking apart into 
islands, the longer generators stay on-line, the 
better the chances to keep the lights on within 
each island and restore service following a blackout; 
so automatic load-shedding, transmission 
relay protections and generator protections must 
avoid premature tripping. They must all be coordi¬ 
nated to reduce the likelihood of system break-up, 
and once break-up occurs, to maximize an island’s 
chances for electrical survival. 
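
As a rough illustration of how automatic under-frequency load-shedding tries to restore that balance, the sketch below sheds a fixed block of load each time frequency falls through a stage threshold. The three stage thresholds and the 10% block sizes are hypothetical values chosen for the example; they are not the settings used by any utility involved in the blackout.

```python
# Hypothetical stages: (frequency threshold in Hz, fraction of original load shed).
UFLS_STAGES = [
    (59.3, 0.10),
    (58.9, 0.10),
    (58.5, 0.10),
]


def load_shed_fraction(frequency_hz, stages=UFLS_STAGES):
    """Total fraction of load shed once frequency has fallen to frequency_hz."""
    return sum(fraction for threshold, fraction in stages
               if frequency_hz <= threshold)


def remaining_load_mw(original_load_mw, frequency_hz, stages=UFLS_STAGES):
    """Load still connected after all stages at or above frequency_hz operate."""
    return original_load_mw * (1.0 - load_shed_fraction(frequency_hz, stages))


for f in (59.5, 59.0, 58.4):
    print(f"{f:.1f} Hz -> {remaining_load_mw(10000.0, f):.0f} MW still connected")
```

If the load shed by all stages is still smaller than the generation deficit in the island, frequency keeps falling and generators begin to trip on their own protection, which is the outcome the text describes for the Cleveland area.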

Why the Blackout Stopped 
Where It Did 

Extreme system conditions can damage equip¬ 
ment in several ways, from melting aluminum 
conductors (excessive currents) to breaking tur¬ 
bine blades on a generator (frequency excursions). 
The power system is designed to ensure that if 
conditions on the grid (excessive or inadequate 
voltage, apparent impedance or frequency) 
threaten the safe operation of the transmission 
lines, transformers, or power plants, the threat¬ 
ened equipment automatically separates from the 
network to protect itself from physical damage. 
Relays are the devices that effect this protection. 

Generators are usually the most expensive units 
on an electrical system, so system protection 
schemes are designed to drop a power plant off 
the system as a self-protective measure if grid 
conditions become unacceptable. This protective 


measure leaves the generator in good condition to 
help rebuild the system once a blackout is over 
and restoration begins. When unstable power 
swings develop between a group of generators that 
are losing synchronization (unable to match fre¬ 
quency) with the rest of the system, one effective 
way to stop the oscillations is to stop the flows 
entirely by disconnecting the unstable generators 
from the remainder of the system. The most com¬ 
mon way to protect generators from power oscilla¬ 
tions is for the transmission system to detect the 
power swings and trip at the locations detecting 
the swings—ideally before the swing reaches criti¬ 
cal levels and harms the generator or the system. 

On August 14, the cascade became a race between 
the power surges and the relays. The lines that 
tripped first were generally the longer lines with 
relay settings using longer apparent impedance 
tripping zones and normal time settings. On 
August 14, relays on long lines such as the Homer 
City-Watercure and the Homer City-Stolle Road 
345-kV lines in Pennsylvania, that are not highly 
integrated into the electrical network, tripped 
quickly and split the grid between the sections 
that blacked out and those that recovered without 
further propagating the cascade. This same phe¬ 
nomenon was seen in the Pacific Northwest black¬ 
outs of 1996, when long lines tripped before more 
networked, electrically supported lines. 

Figure 6.23. New York and New England Separate,
Multiple Islands Form

Transmission line voltage divided by its current
flow is called “apparent impedance.” Standard
transmission line protective relays continuously
measure apparent impedance. When apparent
impedance drops within the line’s protective relay
set-points for a given period of time, the relays trip
the line. The vast majority of trip operations on
lines along the blackout boundaries between PJM 
and New York (for instance) show high-speed 
relay targets which indicate that a massive power 
surge caused each line to trip. To the relays, this 
power surge altered the voltages and currents 
enough that they appeared to be faults. The power 
surge was caused by power flowing to those areas 
that were generation-deficient (Cleveland, Toledo 
and Detroit) or rebounding back. These flows 
occurred purely because of the physics of power 
flows, with no regard to whether the power flow 
had been scheduled, because power flows from
areas with excess generation into areas that are
generation-deficient.

Protective relay settings on transmission lines 
operated as they were designed and set to behave 
on August 14. In some cases line relays did not trip 
in the path of a power surge because the apparent 
impedance on the line was not low enough—not 
because of the magnitude of the current, but rather 
because voltage on that line was high enough that 
the resulting impedance was adequate to avoid 
entering the relay’s target zone. Thus relative volt¬ 
age levels across the northeast also affected which 
areas blacked out and which areas stayed on-line. 
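
As a rough illustration of the relay behavior described above, the sketch below computes apparent impedance as voltage divided by current and trips a line only when that value stays inside the relay zone for a set number of consecutive samples. The zone reach, delay, sample values and function names are illustrative assumptions, not settings from any relay that operated on August 14. Real distance relays evaluate complex impedance in the R-X plane; this version uses magnitude only.

```python
def apparent_impedance(voltage_kv: float, current_ka: float) -> float:
    """Apparent impedance magnitude in ohms (kV divided by kA)."""
    return voltage_kv / current_ka


def relay_trips(samples, zone_reach_ohms: float, delay_samples: int) -> bool:
    """Return True if impedance stays inside the zone long enough to trip.

    samples: sequence of (voltage_kv, current_ka) pairs at a fixed interval.
    delay_samples: consecutive in-zone samples required before tripping.
    """
    in_zone_count = 0
    for voltage_kv, current_ka in samples:
        if apparent_impedance(voltage_kv, current_ka) < zone_reach_ohms:
            in_zone_count += 1
            if in_zone_count >= delay_samples:
                return True
        else:
            in_zone_count = 0  # impedance left the zone, so the timer resets
    return False


# The same hypothetical surge current on two lines; only the voltage differs.
surge_ka = 3.0
depressed_voltage = [(300.0, surge_ka)] * 5  # 100 ohms apparent impedance
supported_voltage = [(350.0, surge_ka)] * 5  # about 117 ohms apparent impedance

low_voltage_trip = relay_trips(depressed_voltage, zone_reach_ohms=110.0, delay_samples=3)
high_voltage_trip = relay_trips(supported_voltage, zone_reach_ohms=110.0, delay_samples=3)
print(low_voltage_trip, high_voltage_trip)
```

With identical current, the line with the lower voltage falls inside the assumed zone and trips, while the line with better voltage support stays outside it, which is the effect the paragraph above attributes to relative voltage levels across the northeast.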

In the U.S. Midwest, as voltage levels declined,
many generators in the affected area were operating
at maximum reactive power output before the
blackout. This left the system little slack to deal
with the low voltage conditions by ramping up
more generators to higher reactive power output
levels, so there was little room to absorb any system
“bumps” in voltage or frequency. In contrast,
in the northeast—particularly PJM, New York, and 
ISO-New England—operators were anticipating 
high power demands on the afternoon of August 
14, and had already set up the system to maintain 
higher voltage levels and therefore had more reac¬ 
tive reserves on-line in anticipation of later after¬ 
noon needs. Thus, when the voltage and 
frequency swings began, these systems had reac¬ 
tive power readily available to help buffer their 
areas against potential voltage collapse without 
widespread generation trips. 

The investigation team has used simulation to 
examine whether special protection schemes, 
designed to detect an impending cascade and sep¬ 
arate the grid at specific interfaces, could have 
been or should be set up to stop a power surge and 
prevent it from sweeping through an interconnec¬ 
tion and causing the breadth of line and generator 
trips and islanding that occurred that day. The 


team has concluded that such schemes would 
have been ineffective on August 14. 


Under-Frequency and 
Under-Voltage Load-Shedding 


Automatic load-shedding measures are designed 
into the electrical system to operate as a last resort, 
under the theory that it is wise to shed some load 
in a controlled fashion if it can forestall the loss of 
a great deal of load to an uncontrollable cause. 
Thus there are two kinds of automatic load-shed¬ 
ding installed in North America—under-voltage 
load-shedding, which sheds load to prevent local 
area voltage collapse, and under-frequency load¬ 
shedding, which is designed to rebalance load and 
generation within an electrical island once it has 
been created by a system disturbance. 


Automatic under-voltage load-shedding (UVLS) 
responds directly to voltage conditions in a local 
area. UVLS drops several hundred MW of load in 
pre-selected blocks within urban load centers, 
triggered in stages when local voltage drops to a 
designated level—likely 89 to 92% or even 
higher—with a several second delay. The goal of a 
UVLS scheme is to eliminate load in order to 
restore reactive power relative to demand, to pre¬ 
vent voltage collapse and contain a voltage prob¬ 
lem within a local area rather than allowing it to 
spread in geography and magnitude. If the first 
load-shed step does not allow the system to 
rebalance, and voltage continues to deteriorate, 
then the next block of UVLS is dropped. Use of 
UVLS is not mandatory, but is done at the option 
of the control area and/or reliability council. UVLS 
schemes and trigger points should be designed to 
respect the local area’s system vulnerabilities, based
on voltage collapse studies.
(Recommendation 21, page 158)

As noted in Chapter 4, there 
is no UVLS system in place within Cleveland and 
Akron; had such a scheme been implemented 
before August, 2003, shedding 1,500 MW of load 
in that area before the loss of the Sammis-Star line 
might have prevented the cascade and blackout. 
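
The staged UVLS behavior described above can be sketched as a small state machine: each stage waits for voltage to stay at or below its threshold for a delay, sheds its block, and then hands over to the next stage. The thresholds (chosen inside the 89 to 92% range mentioned in the text), the three-sample delay, and the block names are hypothetical, and this is not a description of any installed scheme.

```python
def uvls_blocks_shed(voltage_pu_trace, stages, delay_samples):
    """Return the load blocks shed, in order, for a per-unit voltage trace.

    stages: list of (threshold_pu, block_name), first stage first.
    delay_samples: consecutive low-voltage samples needed before a stage acts.
    """
    shed = []
    low_count = 0
    next_stage = 0
    for voltage_pu in voltage_pu_trace:
        if next_stage >= len(stages):
            break  # every block has already been dropped
        threshold_pu, block = stages[next_stage]
        if voltage_pu <= threshold_pu:
            low_count += 1
            if low_count >= delay_samples:
                shed.append(block)
                next_stage += 1
                low_count = 0  # the next block waits out its own delay
        else:
            low_count = 0
    return shed


# Hypothetical two-stage scheme, one sample per second.
stages = [(0.92, "block A"), (0.90, "block B")]

recovers = uvls_blocks_shed([0.95, 0.91, 0.91, 0.91, 0.93, 0.94, 0.95], stages, 3)
keeps_falling = uvls_blocks_shed([0.91, 0.91, 0.91, 0.89, 0.89, 0.89], stages, 3)
print(recovers, keeps_falling)
```

In the first trace voltage recovers after the first block is dropped, so the second stage never operates; in the second trace voltage keeps deteriorating and the next block is shed as well.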


In contrast to UVLS, automatic under-frequency 
load-shedding (UFLS) is designed for use in 
extreme conditions to stabilize the balance 
between generation and load after an electrical 
island has been formed, dropping enough load to 
allow frequency to stabilize within the island. 
All synchronous generators in North America
are designed to operate at 60 cycles per second
(Hertz), and frequency reflects how well load and
generation are balanced—if there is more load 
than generation at any moment, frequency drops 
below 60 Hz, and it rises above that level if there is 
more generation than load. By dropping load to 
match available generation within the island, 
UFLS is a safety net that helps to prevent the com¬ 
plete blackout of the island, which allows faster 
system restoration afterward. UFLS is not effective 
if there is electrical instability or voltage collapse 
within the island. 

Today, UFLS installation is a NERC requirement, 
designed to shed at least 25-30% of the load in 
steps within each reliability coordinator region. 
These systems are designed to drop pre-designated
customer load automatically if frequency
gets too low (since low frequency indicates too lit¬ 
tle generation relative to load), starting generally 
when frequency reaches 59.3 Hz. Progressively 
more load is set to drop as frequency levels fall far¬ 
ther. The last step of customer load shedding is set 
at the frequency level just above the set point for 
generation under-frequency protection relays 
(57.5 Hz), to prevent frequency from falling so low 
that generators could be damaged (see Figure 2.4). 

In NPCC, following the Northeast blackout of 
1965, the region adopted automatic under-fre¬ 
quency load-shedding criteria and manual load¬ 
shedding within ten minutes to prevent a recur¬ 
rence of the cascade and better protect system 
equipment from damage due to a high-speed sys¬ 
tem collapse. Under-frequency load-shedding 
triggers vary by regional reliability council—New 
York and all of the Northeast Power Coordinating 
Council, plus the Mid-Atlantic Area Council use 
59.3 Hz as the first step for UFLS, while ECAR 
uses 59.5 Hz as their first step for UFLS. 
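
A minimal sketch of stepped UFLS follows, assuming a hypothetical three-stage table. The 59.3 Hz first step and the 57.5 Hz generator under-frequency set point come from the text, and 58.8 Hz is the second-stage threshold the report cites for Ontario; the third stage and the 10% block sizes are illustrative assumptions chosen to total 30% of load.

```python
GENERATOR_UF_TRIP_HZ = 57.5  # generator under-frequency set point cited in the text

# (trigger frequency in Hz, percent of load shed at that step); illustrative values.
UFLS_STAGES = [(59.3, 10), (58.8, 10), (58.3, 10)]


def load_shed_percent(min_frequency_hz: float, stages=UFLS_STAGES) -> int:
    """Total percent of load shed once frequency has dipped to min_frequency_hz."""
    return sum(pct for trigger_hz, pct in stages if min_frequency_hz <= trigger_hz)


def stages_stay_above_generator_trip(stages=UFLS_STAGES) -> bool:
    """Every load-shedding step should act before generator protection would."""
    return all(trigger_hz > GENERATOR_UF_TRIP_HZ for trigger_hz, _ in stages)


for dip_hz in (59.5, 59.3, 58.5, 58.0):
    print(dip_hz, load_shed_percent(dip_hz))
```

Progressively deeper frequency dips pick up more steps, and the final check mirrors the requirement that the last load-shedding step sits above the generator under-frequency set point.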

The following automatic UFLS operated on the 
afternoon of August 14: 

♦ Ohio shed over 1,883 MVA beginning at 
16:10:39 EDT 

♦ Michigan shed a total of 2,835 MW 

♦ New York shed a total of 10,648 MW in numer¬ 
ous steps, beginning at 16:10:48 

♦ PJM shed a total of 1,324 MVA in 3 steps in 
northern New Jersey beginning at 16:10:48 EDT 

♦ Ontario shed a total of 7,800 MW in 2 steps, 
beginning at 16:10:4 

♦ New England shed a total of 1,098 MW. 


It must be emphasized that the entire northeast 
system was experiencing large scale, dynamic 
oscillations in this period. Even if the UFLS and 
generation had been perfectly balanced at any 
moment in time, these oscillations would have 
made stabilization difficult and unlikely. 

Why the Generators Tripped Off 

At least 265 power plants with more than 508 indi¬ 
vidual generating units shut down in the August 
14 blackout. These U.S. and Canadian plants can 
be categorized as follows: 

By reliability coordination area: 

♦ Hydro Quebec, 5 plants (all isolated onto the 
Ontario system) 4 

♦ Ontario, 92 plants 

♦ ISO-New England, 31 plants 

♦ MISO, 32 plants 

♦ New York ISO, 70 plants 

♦ PJM, 35 plants 

By type: 

♦ Conventional steam units, 66 plants (37 coal) 

♦ Combustion turbines, 70 plants (37 combined 
cycle) 

♦ Nuclear, 10 plants—7 U.S. and 3 Canadian, 
totaling 19 units (the nuclear unit outages are 
discussed in Chapter 8) 

♦ Hydro, 101 

♦ Other, 18. 
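
As a quick consistency check on the two breakdowns above, the tally below sums the plant counts by reliability coordination area and by type. The counts are taken directly from the lists; only the dictionary and variable names are introduced here.

```python
plants_by_area = {
    "Hydro Quebec": 5,
    "Ontario": 92,
    "ISO-New England": 31,
    "MISO": 32,
    "New York ISO": 70,
    "PJM": 35,
}

plants_by_type = {
    "Conventional steam": 66,
    "Combustion turbine": 70,
    "Nuclear": 10,
    "Hydro": 101,
    "Other": 18,
}

total_by_area = sum(plants_by_area.values())
total_by_type = sum(plants_by_type.values())
print(total_by_area, total_by_type)
```

Both breakdowns add up to 265, matching the "at least 265 power plants" stated at the start of this section.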

Within the overall cascade sequence, 29 (6%) gen¬ 
erators tripped between the start of the cascade at 
16:05:57 (the Sammis-Star trip) and the split 
between Ohio and Pennsylvania at 16:10:38.6 
EDT (Erie West-Ashtabula-Perry), which triggered 
the first big power swing. These trips were caused 
by the generators’ protective relays responding to 
overloaded transmission lines, so many of these 
trips were reported as under-voltage or over¬ 
current. The next interval in the cascade was as 
the portions of the grid lost synchronism, from 
16:10:38.6 until 16:10:45.2 EDT, when
Michigan-New York-Ontario-New England separated
from the rest of the Eastern Interconnection. Fifty 
more generators (10%) tripped as the islands 
formed, particularly due to changes in configura¬ 
tion, loss of synchronism, excitation system 
failures, with some under-frequency and under-voltage.
In the third phase of generator losses, 431
generators (84%) tripped after the islands formed,
many at the same time that under-frequency
load-shedding was occurring. This is illustrated in 
Figure 6.24. It is worth noting, however, that many 
generators did not trip instantly after the trigger 
condition that led to the trip—rather, many relay 
protective devices operate on time delays of milli¬ 
seconds to seconds in duration, so that a generator 
that reported tripping at 16:10:43 on under¬ 
voltage or “generator protection” might have expe¬ 
rienced the trigger for that condition several sec¬ 
onds earlier. 


The high number of generators that tripped before 
formation of the islands helps to explain why so 
much of the northeast blacked out on August 14— 
many generators had pre-designed protection 
points that shut the unit down early in the cas¬ 
cade, so there were fewer units on-line to prevent 
island formation or to maintain balance between 
load and supply within each island after it formed.
In particular, it appears that some generators tripped
to protect the units from conditions that did not
justify their protection, and many others were set to
trip in ways that were not coordinated with the
region’s under-frequency load-shedding, rendering
that UFLS scheme less effective. Both factors
compromised successful islanding and precipitated the
blackouts in Ontario and New York.
(Recommendation 21, page 158)


Most of the unit separations fell in the category of 
consequential tripping—they tripped off-line in 
response to some outside condition on the grid, 
not because of any problem internal to the plant. 
Some generators became completely removed 
from all loads; because the fundamental operating 
principle of the grid is that load and generation 
must balance, if there was no load to be served the 
power plant shut down in response to over-speed 
and/or over-voltage protection schemes. Others 
were overwhelmed because they were among a 
few power plants within an electrical island, and 
were suddenly called on to serve huge customer 
loads, so the imbalance caused them to trip on 
under-frequency and/or under-voltage protection. 
A few were tripped by special protection schemes 
that activated on excessive frequency or loss of 
pre-studied major transmission elements known 
to require large blocks of generation rejection. 

The large power swings and excursions of system 
frequency put all the units in their path through a 
sequence of major disturbances that shocked sev¬ 
eral units into tripping. Plant controls had actu¬ 
ated fast governor action on several of these to turn 
back the throttle, then turn it forward, only to turn 


it back again as some frequencies changed several 
times by as much as 3 Hz (about 100 times normal 
deviations). Figure 6.25 is a plot of the MW output 
and frequency for one large unit that nearly sur¬ 
vived the disruption but tripped when in-plant 
hydraulic control pressure limits were eventually 
violated. After the plant control system called for 
shutdown, the turbine control valves closed and 
the generator electrical output ramped down to a 
preset value before the field excitation tripped and 
the generator breakers opened to disconnect the 
unit from the system. This also illustrates the time 
lag between system events and the generator reac¬ 
tion—this generator was first disturbed by system 
conditions at 16:10:37, but did not trip until 
16:11:47, over a minute later. 

Under-frequency (10% of the generators report¬ 
ing) and under-voltage (6%) trips both reflect 
responses to system conditions. Although com¬ 
bustion turbines in particular are designed with 
under-voltage relay protection, it is not clear why 
this is needed. An under-voltage condition by 
itself and over a set time period may not necessar¬ 
ily be a generator hazard (although it could affect 
plant auxiliary systems). Some generator under¬ 
voltage relays were set to trip at or above 90% volt¬ 
age. However, a motor stalls out at about 70% volt¬ 
age and a motor starter contactor drops out around 
75%, so if there is a compelling need to protect the 
turbine from the system the under-voltage trigger 
point should be no higher than 80%. 
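
The reasoning above reduces to a simple screening rule: flag any generator under-voltage setting above the 80% ceiling, since plant motors and contactors ride through until roughly 70 to 75% voltage. The sketch below applies that rule to two hypothetical units; the unit names and their settings are made up for illustration.

```python
MOTOR_STALL_PU = 0.70        # motor stalls at about 70% voltage (from the text)
CONTACTOR_DROPOUT_PU = 0.75  # motor starter contactor drops out around 75%
MAX_UV_TRIP_PU = 0.80        # ceiling for the under-voltage trigger suggested in the text


def exceeds_recommended_ceiling(setting_pu: float) -> bool:
    """True if an under-voltage trip setting is higher than the 80% ceiling."""
    return setting_pu > MAX_UV_TRIP_PU


# Hypothetical settings for two units.
uv_settings = {"unit A": 0.90, "unit B": 0.78}
flagged = [name for name, setting in uv_settings.items()
           if exceeds_recommended_ceiling(setting)]
print(flagged)
```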

An excitation failure is closely related to a voltage 
trip. As local voltages decreased, so did frequency. 
Over-excitation operates on a calculation of 
volts/hertz, so as frequency declines faster than
voltage, over-excitation relays would operate. It is
not clear that these relays were coordinated with 
each machine’s exciter controls, to be sure that it 
was protecting the machine for the proper range of 
its control capabilities. Large units have two relays 
to detect volts/Hz—one at the generator and one at 
the transformer, each with a slightly different 
volts/Hz setting and time delay. It is possible that 
these settings can cause a generator to trip within 
a generation-deficient island as frequency is 
attempting to rebalance, so these settings should 
be carefully evaluated. 
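
To make the volts/hertz calculation concrete, the sketch below expresses it in per-unit terms: per-unit voltage divided by per-unit frequency. The 1.05 per-unit pickup is an assumed illustrative limit, not a setting reported by any unit in the investigation, and the function names are introduced here.

```python
NOMINAL_HZ = 60.0


def volts_per_hertz_pu(voltage_pu: float, frequency_hz: float) -> float:
    """Per-unit volts/hertz: per-unit voltage over per-unit frequency."""
    return voltage_pu / (frequency_hz / NOMINAL_HZ)


def over_excitation_pickup(voltage_pu: float, frequency_hz: float,
                           limit_pu: float = 1.05) -> bool:
    """True if volts/hertz exceeds the assumed pickup limit."""
    return volts_per_hertz_pu(voltage_pu, frequency_hz) > limit_pu


# Normal conditions: rated voltage at 60 Hz gives 1.0 per unit.
normal = over_excitation_pickup(1.00, 60.0)

# Inside a generation-deficient island: frequency has fallen about 8%
# while voltage has fallen only 3%, so the ratio rises above the limit.
island = over_excitation_pickup(0.97, 55.0)
print(normal, island)
```

The second case shows why a volts/hertz relay can operate even though voltage itself is below nominal: the ratio rises whenever frequency falls proportionally faster than voltage.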

Figure 6.24. Generator Trips by Time and Cause

The Eastlake 5 trip at 13:31 EDT was an excitation
system failure: as voltage fell at the generator
bus, the generator tried to quickly increase its
production of voltage on the AC winding of the
machine. This caused the generator’s excitation
protection scheme to trip the plant off to
protect its windings and coils from over-heating.
Several of the other generators which tripped 
early in the cascade came off under similar cir¬ 
cumstances as excitation systems were over¬ 
stressed to hold voltages up. Seventeen generators 
reported tripping for over-excitation. Units that 
trip for a cause related to frequency should be 
evaluated to determine how the unit frequency 
triggers coordinate with the region’s under-fre¬ 
quency load-shedding scheme, to assure that the 
generator trips are sequenced to follow rather than 
precede load-shedding. After UFLS operates to 
drop a large block of load, frequency continues to 
decline for several cycles before rebounding, so it 
is necessary to design an adequate time delay into
generators’ frequency-related protections to keep
each unit on-line long enough to help rebalance against
the remaining load.
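
One way to screen for the coordination problem described above is to compare each unit's under-frequency trip setting and delay against the region's last UFLS step. The sketch below is a simplified screen under stated assumptions: the relay records, the 58.0 Hz last step, and the 0.3 second UFLS operating time are all hypothetical, and a real coordination study would use dynamic simulation rather than two comparisons.

```python
from dataclasses import dataclass


@dataclass
class UnderFrequencyRelay:
    unit: str
    trip_hz: float
    delay_s: float


def precedes_load_shedding(relay: UnderFrequencyRelay,
                           last_ufls_step_hz: float,
                           ufls_operate_s: float) -> bool:
    """True if the unit could trip before the last UFLS step has acted."""
    trips_at_or_above_ufls = relay.trip_hz >= last_ufls_step_hz
    delay_too_short = relay.delay_s <= ufls_operate_s
    return trips_at_or_above_ufls or delay_too_short


# Hypothetical units and regional UFLS parameters.
relays = [
    UnderFrequencyRelay("unit A", trip_hz=58.5, delay_s=0.1),
    UnderFrequencyRelay("unit B", trip_hz=57.5, delay_s=1.0),
    UnderFrequencyRelay("unit C", trip_hz=57.5, delay_s=0.2),
]
needs_review = [r.unit for r in relays
                if precedes_load_shedding(r, last_ufls_step_hz=58.0, ufls_operate_s=0.3)]
print(needs_review)
```

Unit A is flagged because its setting sits above the last load-shedding step, and unit C because its delay is shorter than the assumed time UFLS needs to act; unit B follows load-shedding on both counts.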

Fourteen generators reported tripping for under-excitation
(also known as loss of field), which protects
the generator from exciter component failures.
This protection scheme can operate on stable
as well as transient power swings, so it should be
examined to determine whether the protection
settings are appropriate. Eighteen units, primarily
combustion turbines, reported over-current as
the reason for relay operation.

Some generators in New York failed in a way that 
exacerbated frequency decay. A generator that 
tripped due to a boiler or steam problem may have 
done so to prevent damage due to over-speed and 
limit impact to the turbine-generator shaft when 
the breakers are opened, and it will attempt to 
maintain its synchronous speed until the genera¬ 
tor is tripped. To do this, the mechanical part of 
the system would shut off the steam flow. This 
causes the generator to consume a small amount
of power off the grid to support the unit’s orderly
slow-down and trip due to reverse power flow.
This is a standard practice to avoid turbine
over-speed. Also within New York, 16 gas turbines
totaling about 400 MW reported tripping for loss
of fuel supply, termed “flame out.” These units’
trips should be better understood.

Figure 6.25. Events at One Large Generator During
the Cascade (plotted quantities: MW and Hertz)


Another reason for power plant trips was actions 
or failures of plant control systems. One common 
cause in this category was a loss of sufficient volt¬ 
age to in-plant loads. Some plants run their inter¬ 
nal cooling and processes (house electrical load) 
off the generator or off small, in-house auxiliary 
generators, while others take their power off the 
main grid. When large power swings or voltage 
drops reached these plants in the latter category, 
they tripped off-line because the grid could not 
supply the plant’s in-house power needs reliably. 
At least 17 units reported tripping due to loss of
system configuration, including the loss of a transmission
or distribution line to serve the in-plant loads.
Some generators were tripped by their operators.
(Recommendation 11, page 148)


Unfortunately, 40% of the generators that went 
off-line during or after the cascade did not provide 
useful information on the cause of tripping in their 
response to the NERC investigation data request. 
While the responses available offer significant and 
valid information, the investigation team will 
never be able to fully analyze and explain why so 
many generators tripped off-line so early in the 
cascade, contributing to the speed and extent of 
the blackout. It is clear that every generator should
have some minimum of protection for stator differential,
loss of field, and out-of-step conditions,
to disconnect the unit from the grid when it is not
performing correctly, and also protection against
extreme conditions on the grid that could cause
catastrophic damage to the generator. These
protections should be set tight
enough to protect the unit from the grid, but also
wide enough to assure that the unit remains connected
to the grid as long as possible. This coordination
is a risk management issue that must
balance the needs of the grid and customers relative
to the needs of the individual assets.
(Recommendation 21, page 158)


Key Phase 7 Events 

Electric loads and flows do not respect political
boundaries. After the blackout of 1965, as loads
grew within New York City and neighboring
northern New Jersey, the utilities serving the area 
deliberately increased the integration between the 
systems serving this area to increase the flow 
capability into New York and the reliability of the 
system as a whole. The combination of the facili¬ 
ties in place and the pattern of electrical loads and 
flows on August 14 caused New York to be tightly 
linked electrically to northern New Jersey and 
southwest Connecticut, and moved the weak 
spots on the grid out past this combined load and 
network area. 

Figure 6.26 gives an overview of the power flows 
and frequencies in the period 16:10:45 EDT 
through 16:11:00 EDT, capturing most of the key 
events in Phase 7. 

7A) New York-New England Transmission 
Lines Disconnected: 16:10:46 to 16:10:54 EDT 

Over the period 16:10:46 EDT to 16:10:54 EDT, the 
separation between New England and New York 
occurred. It occurred along five of the northern tie 
lines, and seven lines within southwest Connecti¬ 
cut. At the time of the east-west separation in New 
York at 16:10:49 EDT, New England was isolated 


from the eastern New York island. The only 
remaining tie was the PV-20 circuit connecting 
New England and the western New York island, 
which tripped at 16:10:54 EDT. Because New England
was exporting to New York before the disturbance
across the southwest Connecticut tie, but
importing on the Norwalk-Northport tie, the
Pleasant Valley path opened east of Long Mountain
(in other words, internal to southwest Connecticut)
rather than along the actual New
York-New England tie. 5 Immediately before the
separation, the power swing out of New England 
occurred because the New England generators had 
increased output in response to the drag of power 
through Ontario and New York into Michigan and 
Ohio. 6 The power swings continuing through the 
region caused this separation, and caused Ver¬ 
mont to lose approximately 70 MW of load. 

When the ties between New York and New England
disconnected, most of the New England area
along with Canada’s Maritime Provinces (New
Brunswick and Nova Scotia) became an island
with generation and demand balanced close
enough that it was able to remain operational. The
New England system had been exporting close to
600 MW to New York, so it was relatively generation-rich
and experienced continuing fluctuations
until it reached equilibrium. Before the Maritimes
and New England separated from the Eastern
Interconnection at approximately 16:11 EDT, voltages
became depressed across portions of New
England and some large customers disconnected
themselves automatically. 7 However, southwestern
Connecticut separated from New England and
remained tied to the New York system for about
one minute.

Figure 6.26. Measured Power Flows and Frequency Across Regional Interfaces, 16:10:45 to 16:11:30 EDT,
with Key Events in the Cascade

While frequency within New England wobbled 
slightly and recovered quickly after 16:10:40 EDT, 
frequency of the New York-Ontario-Michigan- 
Ohio island fluctuated severely as additional 
lines, loads and generators tripped, reflecting the 
severe generation deficiency in Michigan and 
Ohio. 

Due to its geography and electrical characteristics, 
the Quebec system in Canada is tied to the remain¬ 
der of the Eastern Interconnection via high voltage 
DC (HVDC) links instead of AC transmission lines. 
Quebec was able to survive the power surges with 
only small impacts because the DC connections 
shielded it from the frequency swings. 

7B) New York Transmission Split East-West: 
16:10:49 EDT 

The transmission system split internally within 
New York along the Total East interface, with the 
eastern portion islanding to contain New York 
City, northern New Jersey, and southwestern Con¬ 
necticut. The eastern New York island had been 
importing energy, so it did not have enough sur¬ 
viving generation on-line to balance load. Fre¬ 
quency declined quickly to below 58.0 Hz and 
triggered 7,115 MW of automatic UFLS. 8 Fre¬ 
quency declined further, as did voltage, causing 
pre-designed trips at the Indian Point nuclear 
plant and other generators in and around New 
York City through 16:11:10 EDT. The western por¬ 
tion of New York remained connected to Ontario 
and eastern Michigan. 

The electric system has inherent weak points that 
vary as a function of the characteristics of the 
physical lines and plants and the topology of the 
lines, loads and flows across the grid at any point 
in time. The weakest points on a system tend to be 
those points with the highest impedance, which 
routinely are long (over 50 miles or 80 km) over¬ 
head lines with high loading. When such lines 
have high-speed relay protections that may trip on 


high current and overloads in addition to true 
faults, they will trip out before other lines in the 
path of large power swings such as the 3,500 MW 
power surge that hit New York on August 14. New 
York’s Total East and Central East interfaces, 
where the internal split occurred, are routinely 
among the most heavily loaded paths in the state 
and are operated under thermal, voltage and sta¬ 
bility limits to respect their relative vulnerability 
and importance. 

Examination of the loads and generation in the
Eastern New York island indicates that before 16:10:00
EDT, the area had been importing electricity and
had less generation on-line than load. At 16:10:50
EDT, seconds after the separation along the Total 
East interface, the eastern New York area had 
experienced significant load reductions due to 
under-frequency load-shedding—Consolidated 
Edison, which serves New York City and sur¬ 
rounding areas, dropped over 40% of its load on 
automatic UFLS. But at this time, the system was 
still experiencing dynamic conditions—as illus¬ 
trated in Figure 6.26, frequency was falling, flows 
and voltages were oscillating, and power plants 
were tripping off-line. 

Had there been a slow islanding situation and 
more generation on-line, it might have been possi¬ 
ble for the Eastern New York island to rebalance 
given its high level of UFLS. But the available 
information indicates that events happened so 
quickly and the power swings were so large that 
rebalancing would have been unlikely, with or 
without the northern New Jersey and southwest 
Connecticut loads hanging onto eastern New 
York. This was further complicated because the 
high rate of change in voltages at load buses 
reduced the actual levels of load shed by UFLS rel¬ 
ative to the levels needed and expected. 

The team could not find any way that one electri¬ 
cal region might have protected itself against the 
August 14 blackout, either at electrical borders or 
internally. The team also looked at whether it was 
possible to design special protection schemes to 
separate one region from its neighbors proactively,
to buffer itself from a power swing before
it hit. This was found to be inadvisable for two rea¬ 
sons: (1) as noted above, the act of separation itself 
could cause oscillations and dynamic instability 
that could be as damaging to the system as the 
swing it was protecting against; and (2) there was 
no event or symptom on August 14 that could be 
used to trigger such a protection scheme in time. 




7C) The Ontario System Just West of Niagara 
Falls and West of St. Lawrence Separated from 
the Western New York Island: 16:10:50 EDT 

At 16:10:50 EDT, Ontario and New York separated 
west of the Ontario/New York interconnection, 
due to relay operations which disconnected nine 
230-kV lines within Ontario. These left most of 
Ontario isolated to the north. Ontario’s large Beck 
and Saunders hydro stations, along with some 
Ontario load, the New York Power Authority’s 
(NYPA) Niagara and St. Lawrence hydro stations, 
and NYPA’s 765-kV AC interconnection to their 
HVDC tie with Quebec, remained connected to the 
western New York system, supporting the demand 
in upstate New York. 


7D) Southwest Connecticut Separated from 
New York City: 16:11:22 EDT 

In southwest Connecticut, when the Long Moun¬ 
tain-Plum Tree line (connected to the Pleasant 
Valley substation in New York) disconnected at 
16:11:22 EDT, it left about 500 MW of southwest 
Connecticut demand supplied only through a 
138-kV underwater tie to Long Island. About two 
seconds later, the two 345-kV circuits connecting 
southeastern New York to Long Island tripped, 
isolating Long Island and southwest Connecticut, 
which remained tied together by the underwater 
Norwalk Harbor-to-Northport 138-kV cable. The 
cable tripped about 20 seconds later, causing 
southwest Connecticut to black out. 


From 16:10:49 to 16:10:50 EDT, frequency in 
Ontario declined below 59.3 Hz, initiating auto¬ 
matic under-frequency load-shedding (3,000 
MW). This load-shedding dropped about 12% of 
Ontario’s remaining load. Between 16:10:50 EDT 
and 16:10:56 EDT, the isolation of Ontario’s 2,300 
MW Beck and Saunders hydro units onto the 
western New York island, coupled with 
under-frequency load-shedding in the western 
New York island, caused the frequency in this 
island to rise to 63.4 Hz due to excess generation 
relative to the load within the island (Figure 6.27). 
The high frequency caused trips of five of the U.S. 
nuclear units within the island, and the last one 
tripped on the second frequency rise. 

Three of the tripped 230-kV transmission circuits 
near Niagara automatically reconnected Ontario 
to New York at 16:10:56 EDT by reclosing. Even 
with these lines reconnected, the main Ontario 
island (still attached to New York and eastern 
Michigan) was then extremely deficient in genera¬ 
tion, so its frequency declined towards 58.8 Hz, 
the threshold for the second stage of under¬ 
frequency load-shedding. Within the next two sec¬ 
onds another 19% of Ontario demand (4,800 MW) 
automatically disconnected by under-frequency 
load-shedding. At 16:11:10 EDT, these same three 
lines tripped a second time west of Niagara, and 
New York and most of Ontario separated for a final 
time. Following this separation, the frequency in 
Ontario declined to 56 Hz by 16:11:57 EDT. With 
Ontario still supplying 2,500 MW to the
Michigan-Ohio load pocket, the remaining ties with
Michigan tripped at 16:11:57 EDT. Ontario system 
frequency declined, leading to a widespread shut¬ 
down at 16:11:58 EDT and the loss of 22,500 MW 
of load in Ontario, including the cities of Toronto, 
Hamilton, and Ottawa. 


Within the western New York island, the 345-kV 
system remained intact from Niagara east to the 
Utica area, and from the St. Lawrence/Plattsburgh 
area south to the Utica area through both the 
765-kV and 230-kV circuits. Ontario’s Beck and 
Saunders generation remained connected to New 
York at Niagara and St. Lawrence, respectively, 
and this island stabilized with about 50% of the 
pre-event load remaining. The boundary of this 
island moved southeastward as a result of the 
reclosure of Fraser-to-Coopers Corners 345-kV 
line at 16:11:23 EDT. 

As a result of the severe frequency and voltage 
changes, many large generating units in New York 
and Ontario tripped off-line. The eastern island of 
New York, including the heavily populated areas 
of southeastern New York, New York City, and 
Long Island, experienced severe frequency and 
voltage declines. At 16:11:29 EDT, the New
Scotland-to-Leeds 345-kV circuits tripped, separating
the island into northern and southern sections.
The small remaining load in the northern portion
of the eastern island (the Albany area) retained
electric service, supplied by local generation until
it could be resynchronized with the western New
York island.

Figure 6.27. Frequency Separation Between Ontario
and Western New York

7E) Remaining Transmission Lines Between 
Ontario and Eastern Michigan Separated: 
16:11:57 EDT 

Before the blackout, New England, New York, 
Ontario, eastern Michigan, and northern Ohio 
were scheduled net importers of power. When the 
western and southern lines serving Cleveland, 
Toledo, and Detroit collapsed, most of the load 
remained on those systems, but some generation 
had tripped. This exacerbated the generation/load 
imbalance in areas that were already importing 
power. The power to serve this load came through 
the only major path available, via Ontario (IMO). 
After most of IMO was separated from New York 
and generation to the north and east, much of the 
Ontario load and generation was lost; it took only 
moments for the transmission paths west from 
Ontario to Michigan to fail. 

When the cascade was over at about 16:12 EDT, 
much of the disturbed area was completely 
blacked out, but there were isolated pockets that 
still had service because load and generation had 
reached equilibrium. Ontario’s large Beck and 
Saunders hydro stations, along with some Ontario 
load, the New York Power Authority’s (NYPA) 
Niagara and St. Lawrence hydro stations, and 
NYPA’s 765-kV AC interconnection to the Quebec 
HVDC tie, remained connected to the western 
New York system, supporting demand in upstate 
New York. 

Electrical islanding. Once the northeast became
isolated, it lost more and more generation relative
to load as more and more power plants tripped
off-line to protect themselves from the growing
disturbance. The severe swings in frequency and
voltage in the area caused numerous lines to trip,
so the isolated area broke further into smaller
islands. The load/generation mismatch also
affected voltages and frequency within these
smaller areas, causing further generator trips and
automatic under-frequency load-shedding, leading
to blackout in most of these areas.

Figure 6.28. Electric Islands Reflected in
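
To give a feel for how quickly a load/generation mismatch pulls frequency down inside an island, the sketch below applies the textbook aggregated swing-equation approximation. This formula and every number in the example (deficit, on-line capacity, inertia constant) are assumptions for illustration; none are taken from the investigation.

```python
def initial_rocof(deficit_mw, online_mva, inertia_h_s, f0_hz=60.0):
    """Initial rate of change of frequency (Hz/s) for an island whose
    load suddenly exceeds generation by `deficit_mw`.

    Uses the aggregated swing-equation approximation
        df/dt = -deficit * f0 / (2 * H * S)
    where H is the inertia constant in seconds and S the on-line
    generating capacity in MVA. A negative result means frequency
    is falling. Governor response and load damping are ignored."""
    return -deficit_mw * f0_hz / (2.0 * inertia_h_s * online_mva)


if __name__ == "__main__":
    # Hypothetical island: 2,500 MW short, 20,000 MVA on-line, H = 4 s.
    print(f"{initial_rocof(2500, 20000, 4.0):.4f} Hz/s")
```

As generators trip, the on-line capacity term shrinks while the deficit grows, so under this approximation the decline accelerates unless load is shed.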

Figure 6.28 shows frequency data collected by the 
distribution-level monitors of Softswitching Technologies,
Inc. (a commercial power quality company
serving industrial customers) for the area
affected by the blackout. The data reveal at least
five separate electrical islands in the Northeast as
the cascade progressed. The two paths of red diamonds
on the frequency scale reflect the Albany
area island (upper path) versus the New York City 
island, which declined and blacked out much 
earlier. 

Cascading Sequence Essentially Complete: 
16:13 EDT 

Most of the Northeast (the area shown in gray in 
Figure 6.29) was now blacked out. Some isolated 
areas of generation and load remained on-line for 
several minutes. Some of those areas in which a 
close generation-demand balance could be maintained
remained operational.

One relatively large island remained in operation 
serving about 5,700 MW of demand, mostly in 
western New York, anchored by the Niagara and 
St. Lawrence hydro plants. This island formed the 
basis for restoration in both New York and 
Ontario. 

The entire cascade sequence is depicted graphically
in Figure 6.30.


Figure 6.29. Area Affected by the Blackout

Figure 6.30. Cascade Sequence

Panel times recovered from the figure: 1. 16:05:57; 2. 16:05:58; 3. 16:09:25; 7. 16:10:41; 8. 16:10:44.

Legend: Yellow arrows represent the overall pattern of electricity flows. Black lines represent approximate points of separation
between areas within the Eastern Interconnect. Gray shading represents areas affected by the blackout.

Endnotes

1 New York Independent System Operator, Interim Report on 
the August 14, 2003 Blackout, January 8, 2004, p. 14. 

2 Ibid., p. 14. 

3 These zone 2s are set on the 345-kV lines into the Argenta 
substation. The lines are owned by Michigan Electric Transmission
Company and maintained by Consumers Power.
Since the blackout occurred, Consumers Power has 
proactively changed the relay setting from 88 Ohms to 55 
Ohms to reduce the reach of the relay. Source: Charles Rogers, 
Consumers Power. 

4 The province of Quebec, although considered a part of the
Eastern Interconnection, is connected to the rest of the Eastern
Interconnection only by DC ties. In this instance, the DC
ties acted as buffers between portions of the Eastern Interconnection;
transient disturbances propagate through them less
readily. Therefore, the electricity system in Quebec was not
affected by the outage, except for a small portion of the province’s
load that is directly connected to Ontario by AC transmission
lines. (Although DC ties can act as a buffer between
systems, the tradeoff is that they do not allow instantaneous
generation support following the unanticipated loss of a generating
unit.)


5 New York Independent System Operator, Interim Report on 
the August 14, 2003 Blackout, January 8, 2004, p. 20. 

6 Ibid., p. 20. 

7 After New England’s separation from the Eastern Interconnection
occurred, the next several minutes were critical to
stabilizing the ISO-NE system. Voltages in New England
recovered and overshot to high levels due to the combination of
load loss, capacitors still in service, lower reactive losses on
the transmission system, and loss of generation to regulate
system voltage. Over-voltage protective relays operated to trip
both transmission and distribution capacitors. Operators in
New England brought all fast-start generation on-line by
16:16 EDT. Much of the customer process load was automatically
restored. This caused voltages to drop again, putting
portions of New England at risk of voltage collapse. Operators 
manually dropped 80 MW of load in southwest Connecticut 
by 16:39 EDT, another 325 MW in Connecticut and 100 MW 
in western Massachusetts by 16:40 EDT. These measures 
helped to stabilize their island following their separation 
from the rest of the Eastern Interconnection. 

8 New York Independent System Operator, Interim Report on
the August 14, 2003 Blackout, January 8, 2004, p. 23.

7. The August 14 Blackout Compared With
Previous Major North American Outages 


Incidence and Characteristics 
of Power System Outages 

Short, localized outages occur on power systems 
fairly frequently. System-wide disturbances that 
affect many customers across a broad geographic 
area are rare, but they occur more frequently than 
a normal distribution of probabilities would predict.
North American power system outages
between 1984 and 1997 are shown in Figure 7.1 by
the number of customers affected and the rate of
occurrence. While some of these were widespread
weather-related events, some were cascading
events that, in retrospect, were preventable. Electric
power systems are fairly robust and are capable
of withstanding one or two contingency
events, but they are fragile with respect to multiple
contingency events unless the systems are
readjusted between contingencies. With the
shrinking margin in the current transmission system,
it is likely to be more vulnerable to cascading
outages than it was in the past, unless effective 
countermeasures are taken. 

As evidenced by the absence of major transmission
projects undertaken in North America over
the past 10 to 15 years, utilities have found ways to
increase the utilization of their existing facilities
to meet increasing demands without adding significant
high-voltage equipment. Without intervention,
this trend is likely to continue. Pushing
the system harder will undoubtedly increase reliability
challenges. Special protection schemes
may be relied on more to deal with particular challenges,
but the system still will be less able to
withstand unexpected contingencies. 

A smaller transmission margin for reliability 
makes the preservation of system reliability a 
harder job than it used to be. The system is being 
operated closer to the edge of reliability than it 
was just a few years ago. Table 7.1 represents some 
of the changed conditions that make the preservation
of reliability more challenging.


Figure 7.1. North American Power System Outages, 
1984-1997 



Note: The circles represent individual outages in North 
America between 1984 and 1997, plotted against the frequency
of outages of equal or greater size over that period.

Source: Adapted from John Doyle, California Institute of 
Technology, “Complexity and Robustness,” 1999. Data from 
NERC. 
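
The plotting convention described in the note to Figure 7.1 (each outage plotted against the frequency of outages of equal or greater size) can be sketched as follows. The outage sizes below are invented purely for illustration and are not NERC data; the function name is likewise hypothetical.

```python
def exceedance_frequency(sizes, years):
    """For each distinct outage size, return (size, events per year of
    that size or larger), ordered from largest to smallest."""
    return [
        (size, sum(1 for other in sizes if other >= size) / years)
        for size in sorted(set(sizes), reverse=True)
    ]


if __name__ == "__main__":
    # HYPOTHETICAL customers-affected figures, not taken from the report.
    sample = [7_500_000, 2_000_000, 150_000, 150_000, 40_000]
    # 1984 through 1997 inclusive spans 14 years.
    for size, per_year in exceedance_frequency(sample, years=14):
        print(f"{size:>9,d} customers or more: {per_year:.3f} events/year")
```

On log-log axes, a heavy-tailed set of outage sizes traces a roughly straight line under this convention, which is one way to see that large events occur more often than a normal distribution would predict.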

If nothing else changed, one could expect an 
increased frequency of large-scale events as compared
to historical experience. The last and most
extreme event shown in Figure 7.1 is the August
10, 1996, outage. August 14, 2003, surpassed that
event in terms of severity. In addition, two significant
outages in the month of September 2003
occurred abroad: one in England and one, initiated 
in Switzerland, that cascaded over much of Italy. 

In the following sections, seven previous outages 
are reviewed and compared with the blackout of 
August 14, 2003: (1) Northeast blackout on 
November 9, 1965; (2) New York City blackout on 
July 13, 1977; (3) West Coast blackout on December
22, 1982; (4) West Coast blackout on July 2-3,
1996; (5) West Coast blackout on August 10, 1996;
(6) Ontario and U.S. North Central blackout on
June 25, 1998; and (7) Northeast outages and non-outage
disturbances in the summer of 1999.

Outage Descriptions
and Major Causal Factors 

November 9, 1965: Northeast Blackout 

This disturbance resulted in the loss of over 
20,000 MW of load and affected 30 million people. 
Virtually all of New York, Connecticut, Massachusetts,
Rhode Island, small segments of northern
Pennsylvania and northeastern New Jersey, and
substantial areas of Ontario, Canada, were
affected. Outages lasted for up to 13 hours. This
event resulted in the formation of the North American
Electric Reliability Council in 1968.

A backup protective relay operated to open one of 
five 230-kV lines taking power north from a generating
plant in Ontario to the Toronto area. When
the flows redistributed instantaneously on the
remaining four lines, they tripped out successively
in a total of 2.5 seconds. The resultant
power swings caused a cascading outage that
blacked out much of the Northeast.

The major causal factors were as follows: 

♦ Operation of a backup protective relay took a 
230-kV line out of service when the loading on 
the line exceeded the 375-MW relay setting. 

♦ Operating personnel were not aware of the 
operating set point of this relay. 

♦ Another 230-kV line opened by an overcurrent 
relay action, and several 115- and 230-kV lines 
opened by protective relay action. 


♦ Two key 345-kV east-west (Rochester-Syracuse) 
lines opened due to instability, and several 
lower voltage lines tripped open. 

♦ Five of 16 generators at the St. Lawrence 
(Massena) plant tripped automatically in 
accordance with predetermined operating 
procedures. 

♦ Following additional line tripouts, 10 generating
units at Beck were automatically shut down
by low governor oil pressure, and 5 pumping
generators were tripped off by overspeed governor
control.

♦ Several other lines then tripped out on 
under-frequency relay action. 
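
The way the loss of one of five parallel lines can take out the remaining four may be sketched with a deliberately simple model. The 375-MW relay setting is quoted in the list above; the total transfer figures, the assumption that flow shares equally among identical lines, and the function name are hypothetical and serve only to illustrate the mechanism.

```python
RELAY_SETTING_MW = 375.0  # backup relay setting quoted in the text


def lines_left_after_trip(total_mw, n_lines, setting_mw=RELAY_SETTING_MW):
    """Take one of `n_lines` identical parallel lines out of service
    (the initiating trip is taken as given), share `total_mw` equally
    among the survivors, and keep tripping lines while the per-line
    flow exceeds the relay setting. Returns the number of lines still
    in service when the sequence stops."""
    remaining = n_lines - 1  # initiating trip
    while remaining > 0 and total_mw / remaining > setting_mw:
        remaining -= 1  # another overloaded line trips
    return remaining


if __name__ == "__main__":
    # HYPOTHETICAL transfers: 1,600 MW cascades, 1,400 MW survives.
    for total in (1600.0, 1400.0):
        print(f"{total:.0f} MW: {lines_left_after_trip(total, 5)} lines left")
```

In this model, once the per-line share exceeds the setting after the first trip, every later trip only raises the share further, so the sequence runs until no lines remain.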

July 13, 1977: New York City Blackout 

This disturbance resulted in the loss of 6,000 MW 
of load and affected 9 million people in New York 
City. Outages lasted for up to 26 hours. A series of 
events triggering the separation of the Consolidated
Edison system from neighboring systems
and its subsequent collapse began when two
345-kV lines on a common tower in Northern
Westchester were struck by lightning and tripped
out. Over the next hour, despite Consolidated Edison
dispatcher actions, the system electrically
separated from surrounding systems and collapsed.
With the loss of imports, generation in
New York City was not sufficient to serve the load 
in the city. 

Table 7.1. Changing Conditions That Affect System Reliability

Previous Conditions | Emerging Conditions
Fewer, relatively large resources | Smaller, more numerous resources
Long-term, firm contracts | Contracts shorter in duration; more non-firm transactions, fewer long-term firm transactions
Bulk power transactions relatively stable and predictable | Bulk power transactions relatively variable and less predictable
Assessment of system reliability made from stable base (narrower, more predictable range of potential operating states) | Assessment of system reliability made from variable base (wider, less predictable range of potential operating states)
Limited and knowledgeable set of utility players | More players making more transactions, some with less interconnected operation experience; increasing with retail access
Unused transmission capacity and high security margins | High transmission utilization and operation closer to security limits
Limited competition, little incentive for reducing reliability investments | Utilities less willing to make investments in transmission reliability that do not increase revenues
Market rules and reliability rules developed together | Market rules undergoing transition, reliability rules developed separately
Limited wheeling | More system throughput

Major causal factors were:

♦ Two 345-kV lines connecting Buchanan South
to Millwood West experienced a phase B to 
ground fault caused by a lightning strike. 

♦ Circuit breaker operations at the Buchanan 
South ring bus isolated the Indian Point No. 3 
generating unit from any load, and the unit tripped
for a rejection of 883 MW of load.

♦ Loss of the ring bus isolated the 345-kV tie to 
Ladentown, which had been importing 427 
MW, making the cumulative resources lost 
1,310 MW. 

♦ 18.5 minutes after the first incident, an additional
lightning strike caused the loss of two
345-kV lines, which connect Sprain Brook to 
Buchanan North and Sprain Brook to Millwood 
West. These two 345-kV lines share common 
towers between Millwood West and Sprain 
Brook. One line (Sprain Brook to Millwood 
West) automatically reclosed and was restored 
to service in about 2 seconds. The failure of the 
other line to reclose isolated the last Consolidated
Edison interconnection to the Northwest.

♦ The resulting surge of power from the Northwest
caused the loss of the Pleasant Valley to
Millwood West line by relay action (a bent contact
on one of the relays at Millwood West
caused the improper action).

♦ 23 minutes later, the Leeds to Pleasant Valley 
345-kV line sagged into a tree due to overload 
and tripped out. 

♦ Within a minute, the 345 kV to 138 kV transformer
at Pleasant Valley overloaded and tripped
off, leaving Consolidated Edison with only
three remaining interconnections. 

♦ Within 3 minutes, the Long Island Lighting Co. 
system operator, on concurrence of the pool dispatcher,
manually opened the Jamaica to Valley
Stream tie. 

♦ About 7 minutes later, the tap-changing mechanism
failed on the Goethals phase-shifter,
resulting in the loss of the Linden-to-Goethals 
tie to PJM, which was carrying 1,150 MW to 
Consolidated Edison. 

♦ The two remaining external 138-kV ties to Consolidated
Edison tripped on overload, isolating
the Consolidated Edison system. 

♦ Insufficient generation in the isolated system 
caused the Consolidated Edison island to 
collapse. 


December 22, 1982: West Coast Blackout 

This disturbance resulted in the loss of 12,350 
MW of load and affected over 5 million people in 
the West. The outage began when high winds 
caused the failure of a 500-kV transmission tower. 
The tower fell into a parallel 500-kV line tower, 
and both lines were lost. The failure of these two 
lines mechanically cascaded and caused three 
additional towers to fail on each line. When the 
line conductors fell they contacted two 230-kV 
lines crossing under the 500-kV rights-of-way, collapsing
the 230-kV lines.

The loss of the 500-kV lines activated a remedial 
action scheme to control the separation of the 
interconnection into two pre-engineered islands 
and trip generation in the Pacific Northwest in 
order to minimize customer outages and speed 
restoration. However, delayed operation of the 
remedial action scheme components occurred for 
several reasons, and the interconnection separated
into four islands.

In addition to the mechanical failure of the transmission
lines, analysis of this outage cited problems
with coordination of protective schemes,
because the generator tripping and separation 
schemes operated slowly or did not operate as 
planned. A communication channel component 
performed sporadically, resulting in delayed 
transmission of the control signal. The backup 
separation scheme also failed to operate, because 
the coordination of relay settings did not anticipate
the power flows experienced in this severe
disturbance. 

In addition, the volume and format in which data 
were displayed to operators made it difficult to 
assess the extent of the disturbance and what corrective
action should be taken. Time references to
events in this disturbance were not tied to a common
standard, making real-time evaluation of the
situation more difficult. 

July 2-3, 1996: West Coast Blackout 

This disturbance resulted in the loss of 11,850 
MW of load and affected 2 million people in the 
West. Customers were affected in Arizona, California,
Colorado, Idaho, Montana, Nebraska,
Nevada, New Mexico, Oregon, South Dakota, 
Texas, Utah, Washington, and Wyoming in the 
United States; Alberta and British Columbia in 
Canada; and Baja California Norte in Mexico. Outages
lasted from a few minutes to several hours.

The outage began when a 345-kV transmission
line in Idaho sagged into a tree and tripped out. A 
protective relay on a parallel transmission line 
also detected the fault and incorrectly tripped a 
second line. An almost simultaneous loss of these 
lines greatly reduced the ability of the system to 
transmit power from the nearby Jim Bridger plant. 
Other relays tripped two of the four generating 
units at that plant. With the loss of those two 
units, frequency in the entire Western Interconnection
began to decline, and voltage began to collapse
in the Boise, Idaho, area, affecting the
California-Oregon AC Intertie transfer limit. 

For 23 seconds the system remained in precarious 
balance, until the Mill Creek to Antelope 230-kV 
line between Montana and Idaho tripped by zone 
3 relay, depressing voltage at Summer Lake Substation
and causing the intertie to slip out of synchronism.
Remedial action relays separated the
system into five pre-engineered islands designed 
to minimize customer outages and restoration 
times. Similar conditions and initiating factors 
were present on July 3; however, as voltage began 
to collapse in the Boise area, the operator shed 
load manually and contained the disturbance. 

August 10, 1996: West Coast Blackout 

This disturbance resulted in the loss of over 
28,000 MW of load and affected 7.5 million people 
in the West. Customers were affected in Arizona, 
California, Colorado, Idaho, Montana, Nebraska, 
Nevada, New Mexico, Oregon, South Dakota, 
Texas, Utah, Washington, and Wyoming in the 
United States; Alberta and British Columbia in 
Canada; and Baja California Norte in Mexico. Outages
lasted from a few minutes to as long as nine
hours. 

Triggered by several major transmission line outages,
the loss of generation from McNary Dam, and
resulting system oscillations, the Western Interconnection
separated into four electrical islands,
with significant loss of load and generation. Prior
to the disturbance, the transmission system from
Canada south through the Northwest into California
was heavily loaded with north-to-south power
transfers. These flows were due to high Southwest 
demand caused by hot weather, combined with 
excellent hydroelectric conditions in Canada and 
the Northwest. 

Very high temperatures in the Northwest caused 
two lightly loaded transmission lines to sag into 
untrimmed trees and trip out. A third heavily 
loaded line also sagged into a tree. Its outage led to
the overload and loss of additional transmission
lines. General voltage decline in the Northwest
and the loss of McNary generation due to incorrectly
applied relays caused power oscillations on
the California to Oregon AC intertie. The intertie’s 
protective relays tripped these facilities out and 
caused the Western Interconnection to separate 
into four islands. Following the loss of the first two 
lightly loaded lines, operators were unaware that 
the system was in an insecure state over the next 
hour, because new operating studies had not been 
performed to identify needed system adjustments. 

June 25, 1998: Upper Midwest Blackout 

This disturbance resulted in the loss of 950 MW of 
load and affected 152,000 people in Minnesota, 
Montana, North Dakota, South Dakota, and Wisconsin
in the United States; and Ontario, Manitoba,
and Saskatchewan in Canada. Outages lasted
up to 19 hours. 

A lightning storm in Minnesota initiated a series of 
events, causing a system disturbance that affected 
the entire Mid-Continent Area Power Pool (MAPP) 
Region and the northwestern Ontario Hydro system
of the Northeast Power Coordinating Council.
A 345-kV line was struck by lightning and tripped
out. Underlying lower voltage lines began to overload
and trip out, further weakening the system.
Soon afterward, lightning struck a second 345-kV 
line, taking it out of service as well. Following the 
outage of the second 345-kV line, the remaining 
lower voltage transmission lines in the area 
became significantly overloaded, and relays took 
them out of service. This cascading removal of 
lines from service continued until the entire 
northern MAPP Region was separated from the 
Eastern Interconnection, forming three islands 
and resulting in the eventual blackout of the 
northwestern Ontario Hydro system. 

Summer of 1999: Northeast U.S. 
Non-outage Disturbances 

Load in the PJM system on July 6, 1999, was 
51,600 MW (approximately 5,000 MW above forecast).
PJM used all emergency procedures (including
a 5% voltage reduction) except manually
tripping load, and imported 5,000 MW from external
systems to serve the record customer demand.
Load on July 19, 1999, exceeded 50,500 MW. PJM 
loaded all available eastern PJM generation and 
again implemented emergency operating procedures
from approximately 12 noon into the evening
on both days.

During these record peak loads, steep voltage
declines were experienced on the bulk transmission
system. In each case, a voltage collapse was
barely averted through the use of emergency procedures.
Low voltage occurred because reactive
demand exceeded reactive supply. High reactive 
demand was due to high electricity demand and 
high losses resulting from high transfers across the 
system. Reactive supply was inadequate because 
generators were unavailable or unable to meet 
rated reactive capability due to ambient conditions,
and because some shunt capacitors were out
of service. 

Common or Similar Factors 
Among Major Outages 

The factors that were common to some of the 
major outages above and the August 14 blackout 
include: (1) conductor contact with trees; (2) overestimation
of dynamic reactive output of system
generators; (3) inability of system operators or
coordinators to visualize events on the entire system;
(4) failure to ensure that system operation
was within safe limits; (5) lack of coordination on
system protection; (6) ineffective communication;
(7) lack of “safety nets;” and (8) inadequate training
of operating personnel. The following sections
describe the nature of these factors and list recommendations
from previous investigations that are
relevant to each. 

Conductor Contact With Trees 

This factor was an initiating trigger in several of 
the outages and a contributing factor in the severity
of several more. Unlike lightning strikes, for
which system operators have fair storm-tracking 
tools, system operators generally do not have 
direct knowledge that a line has contacted a tree 
and faulted. They will sometimes test the line by 
trying to restore it to service, if that is deemed to be 
a safe operation. Even if it does go back into service,
the line may fault and trip out again as load
heats it up. This is most likely to happen when 
vegetation has not been adequately managed, in 
combination with hot and windless conditions. 

In some of the disturbances, tree contact accounted
for the loss of more than one circuit, contributing
multiple contingencies to the weakening of
the system. Lines usually sag into right-of-way
obstructions when the need to retain transmission
interconnection is high. High inductive load
composition, such as air conditioning or irrigation
pumping, accompanies hot weather and places
higher burdens on transmission lines. Losing circuits
contributes to voltage decline. Inductive
load is unforgiving when voltage declines, draw¬ 
ing additional reactive supply from the system 
and further contributing to voltage problems. 

Recommendations from previous investigations 
include: 

♦ Paying special attention to the condition of 
rights-of-way following favorable growing seasons.
Very wet and warm spring and summer
growing conditions preceded the 1996 outages 
in the West. 

♦ Careful review of any reduction in operations 
and maintenance expenses that may contribute 
to decreased frequency of line patrols or trimming.
Maintenance in this area should be
strongly directed toward preventive rather than 
remedial maintenance. 

Dynamic Reactive Output of Generators 

Reactive supply is an important ingredient in 
maintaining healthy power system voltages and 
facilitating power transfers. Inadequate reactive 
supply was a factor in most of the events. Shunt 
capacitors and generating resources are the most 
significant suppliers of reactive power. Operators 
perform contingency analysis based on how 
power system elements will perform under various
power system conditions. They determine and
set transfer limits based on these analyses. Shunt 
capacitors are easy to model because they are 
static. Modeling the dynamic reactive output of 
generators under stressed system conditions has 
proven to be more challenging. If the model is 
incorrect, estimated transfer limits will also be 
incorrect. 

In most of the events, the assumed contribution of 
dynamic reactive output of system generators was 
greater than the generators actually produced, 
resulting in more significant voltage problems. 
Some generators were limited in the amount of 
reactive power they produced by over-excitation 
limits, or necessarily derated because of high 
ambient temperatures. Other generators were controlled
to a fixed power factor and did not contribute
reactive supply in depressed voltage
conditions. Under-voltage load shedding is employed
as an automatic remedial action in some
interconnections to prevent cascading, and could
be used more widely.

Recommendations from previous investigations
concerning voltage support and reactive power 
management include: 

♦ Communicate changes to generator reactive 
capability limits in a timely and accurate manner
for both planning and operational modeling
purposes. 

♦ Investigate the development of a generator 
MVAr/voltage monitoring process to determine 
when generators may not be following reported 
MVAr limits. 

♦ Establish a common standard for generator 
steady-state and post-contingency (15-minute) 
MVAr capability definition; determine methodology,
testing, and operational reporting
requirements. 

♦ Determine the generator service level agreement
that defines generator MVAr obligation to
help ensure reliable operations. 

♦ Periodically review and field test the reactive 
limits of generators to ensure that reported 
MVAr limits are attainable. 

♦ Provide operators with on-line indications of 
available reactive capability from each generating
unit or groups of generators, other VAr
sources, and the reactive margin at all critical 
buses. This information should assist in the 
operating practice of maximizing the use of 
shunt capacitors during heavy transfers and 
thereby increase the availability of system 
dynamic reactive reserve. 

♦ For voltage instability problems, consider fast 
automatic capacitor insertion (both series and 
shunt), direct shunt reactor and load tripping, 
and under-voltage load shedding. 

♦ Develop and periodically review a reactive margin
against which system performance should
be evaluated and used to establish maximum 
transfer levels. 

System Visibility Procedures and 
Operator Tools 

Each control area operates as part of a single synchronous
interconnection. However, the parties
with various geographic or functional responsibilities
for reliable operation of the grid do not have
visibility of the entire system. Events in neighboring
systems may not be visible to an operator or
reliability coordinator, or power system data
may be available in a control center but not be
presented to operators or coordinators as information
they can use in making appropriate operating
decisions.

Recommendations from previous investigations 
concerning visibility and tools include: 

♦ Develop communications systems and displays 
that give operators immediate information on 
changes in the status of major components in 
their own and neighboring systems. 

♦ Supply communications systems with uninterruptible
power, so that information on system
conditions can be transmitted correctly to control
centers during system disturbances.

♦ In the control center, use a dynamic line loading 
and outage display board to provide operating 
personnel with rapid and comprehensive information
about the facilities available and the
operating condition of each facility in service. 

♦ Give control centers the capability to display to 
system operators computer-generated alternative
actions specific to the immediate situation,
together with expected results of each action. 

♦ Establish on-line security analysis capability to 
identify those next and multiple facility outages 
that would be critical to system reliability from 
thermal, stability, and post-contingency voltage 
points of view. 

♦ Establish time-synchronized disturbance monitoring
to help evaluate the performance of the
interconnected system under stress, and design 
appropriate controls to protect it. 

System Operation Within Safe Limits 

Operators in several of the events were unaware of 
the vulnerability of the system to the next contingency.
The reasons were varied: inaccurate modeling
for simulation, no visibility of the loss of key
transmission elements, no operator monitoring of
stability measures (reactive reserve monitor,
power transfer angle), and no reassessment of system
conditions following the loss of an element
and readjustment of safe limits. 

Recommendations from previous investigations 
include: 

♦ Following a contingency, the system must be 
returned to a reliable state within the allowed 
readjustment period. Operating guides must be 
reviewed to ensure that procedures exist to 
restore system reliability in the allowable time 
periods. 

♦ Reduce scheduled transfers to a safe and prudent
level until studies have been conducted to
determine the maximum simultaneous transfer 
capability limits. 

♦ Reevaluate processes for identifying unusual 
operating conditions and potential disturbance 
scenarios, and make sure they are studied 
before they are encountered in real-time operat¬ 
ing conditions. 

Coordination of System Protection 
(Transmission and Generation Elements) 

Protective relays are designed to detect short cir¬ 
cuits and act locally to isolate faulted power sys¬ 
tem equipment from the system—both to protect 
the equipment from damage and to protect the sys¬ 
tem from faulty equipment. Relay systems are 
applied with redundancy in primary and backup 
modes. If one relay fails, another should detect the 
fault and trip appropriate circuit breakers. Some 
backup relays have significant “reach,” such that 
non-faulted line overloads or stable swings may be 
seen as faults and cause the tripping of a line when 
it is not advantageous to do so. Proper coordina¬ 
tion of the many relay devices in an intercon¬ 
nected system is a significant challenge, requiring 
continual review and revision. Some relays can 
prevent resynchronizing, making restoration more 
difficult. 

System-wide controls protect the interconnected 
operation rather than specific pieces of equip¬ 
ment. Examples include controlled islanding to 
mitigate the severity of an inevitable disturbance 
and under-voltage or under-frequency load shed¬ 
ding. Failure to operate (or misoperation of) one or 
more relays as an event developed was a common 
factor in several of the disturbances. 

Recommendations developed after previous out¬ 
ages include: 

♦ Perform system trip tests of relay schemes peri¬ 
odically. At installation the acceptance test 
should be performed on the complete relay 
scheme in addition to each individual compo¬ 
nent so that the adequacy of the scheme is 
verified. 

♦ Continually update relay protection to fit 
changing system development and to incorpo¬ 
rate improved relay control devices. 

♦ Install sensing devices on critical transmission 
lines to shed load or generation automatically if 
the short-term emergency rating is exceeded for
a specified period of time. The time delay
should be long enough to allow the system oper¬ 
ator to attempt to reduce line loadings promptly 
by other means. 

♦ Review phase-angle restrictions that can pre¬ 
vent reclosing of major interconnections during 
system emergencies. Consideration should be 
given to bypassing synchronism-check relays to 
permit direct closing of critical interconnec¬ 
tions when it is necessary to maintain stability 
of the grid during an emergency. 

♦ Review the need for controlled islanding. Oper¬ 
ating guides should address the potential for 
significant generation/load imbalance within 
the islands. 
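
The third recommendation above, shedding load or generation automatically once a line's short-term emergency rating has been exceeded for a specified period, can be illustrated with a minimal Python sketch. It is not drawn from any of the earlier investigations. The names `OverloadTimer` and `first_action_time`, the 1,000 MVA rating, and the 30-second delay are hypothetical and chosen only for illustration. The sketch also assumes the timer resets whenever loading returns within the rating, which is one plausible way to give the operator time to relieve the line by other means.

```python
from dataclasses import dataclass


@dataclass
class OverloadTimer:
    """Time-delayed overload detector for one transmission line (illustrative only)."""
    emergency_rating_mva: float   # short-term emergency rating (hypothetical value)
    delay_s: float                # how long the overload must persist before acting
    _seconds_over: float = 0.0    # accumulated time spent above the rating

    def update(self, loading_mva: float, dt_s: float) -> bool:
        """Feed one measurement; return True when automatic shedding should start."""
        if loading_mva > self.emergency_rating_mva:
            self._seconds_over += dt_s
        else:
            # Loading is back within the rating, so the timer resets (assumption).
            self._seconds_over = 0.0
        return self._seconds_over >= self.delay_s


def first_action_time(samples, timer, dt_s):
    """Return the elapsed time in seconds at which shedding is first requested, or None."""
    for i, loading_mva in enumerate(samples, start=1):
        if timer.update(loading_mva, dt_s):
            return i * dt_s
    return None


# Hypothetical 10-second samples: the overload is interrupted once, so the
# 30-second delay only completes on the sixth sample.
example_samples = [1100.0, 1100.0, 900.0, 1100.0, 1100.0, 1100.0]
example_result = first_action_time(example_samples, OverloadTimer(1000.0, 30.0), 10.0)
```

In this example the brief return to 900 MVA restarts the delay, so automatic action is requested only after three further consecutive overloaded samples. A real scheme would be coordinated with relay settings and operating guides rather than a single fixed timer.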

Effectiveness of Communications 

Under normal conditions, parties with reliability 
responsibility need to communicate important 
and prioritized information to each other in a 
timely way, to help preserve the integrity of the 
grid. This is especially important in emergencies. 
During emergencies, operators should be relieved 
of duties unrelated to preserving the grid. A com¬ 
mon factor in several of the events described 
above was that information about outages occur¬ 
ring in one system was not provided to neighbor¬ 
ing systems. 

Need for Safety Nets 

A safety net is a protective scheme that activates 
automatically if a pre-specified, significant con¬ 
tingency occurs. When activated, such schemes 
involve certain costs and inconvenience, but they 
can prevent some disturbances from getting out of 
control. These plans involve actions such as shed¬ 
ding load, dropping generation, or islanding, and 
in all cases the intent is to have a controlled out¬ 
come that is less severe than the likely uncon¬ 
trolled outcome. If a safety net had not been taken 
out of service in the West in August 1996, it would 
have lessened the severity of the disturbance from 
28,000 MW of load lost to less than 7,200 MW. (It 
has since been returned to service.) Safety nets 
should not be relied upon to establish transfer lim¬ 
its, however. 

Previous recommendations concerning safety nets 
include: 

♦ Establish and maintain coordinated programs 
of automatic load shedding in areas not so 
equipped, in order to prevent total loss of power 
in an area that has been separated from the 
main network and is deficient in generation.
Load shedding should be regarded as an insur¬ 
ance program, however, and should not be used 
as a substitute for adequate system design. 

♦ Install load-shedding controls to allow fast sin¬ 
gle-action activation of large-block load shed¬ 
ding by an operator. 
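
As a rough illustration of the automatic load-shedding programs recommended above, the following Python sketch shows a staged under-frequency scheme in which each stage drops a block of load once frequency falls to its threshold. The thresholds and block sizes in `STAGES` are hypothetical and are not taken from this report or from any particular region's program. The function name `load_shed_fraction` is likewise an assumption made for the example.

```python
# Hypothetical stages: (frequency threshold in Hz, fraction of area load shed).
STAGES = [(59.3, 0.10), (58.9, 0.10), (58.5, 0.10)]


def load_shed_fraction(frequency_hz: float, stages=STAGES) -> float:
    """Return the total fraction of load shed once frequency has fallen to frequency_hz.

    Every stage whose threshold has been reached contributes its block, so the
    total grows as frequency declines further.
    """
    return sum(fraction for threshold, fraction in stages if frequency_hz <= threshold)
```

With these assumed settings, no load is shed at 60 Hz, one block is shed at 59.0 Hz, and all three blocks are shed at 58.0 Hz. Consistent with the recommendation, such a scheme is insurance for an island that is deficient in generation and not a substitute for adequate system design.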

Training of Operating Personnel 

Operating procedures were necessary but not suf¬ 
ficient to deal with severe power system distur¬ 
bances in several of the events. Enhanced 
procedures and training for operating personnel 
were recommended. Dispatcher training facility 
scenarios with disturbance simulation were sug¬ 
gested as well. Operators tended to reduce sched¬ 
ules for transactions but were reluctant to call 
for increased generation—or especially to shed 
load—in the face of a disturbance that threatened 
to bring the whole system down. 

Previous recommendations concerning training 
include: 

♦ Thorough programs and schedules for operator 
training and retraining should be vigorously 
administered. 

♦ A full-scale simulator should be made available 
to provide operating personnel with “hands-on” 
experience in dealing with possible emergency 
or other system conditions. 

♦ Procedures and training programs for system 
operators should include anticipation, recogni¬ 
tion, and definition of emergency situations. 

♦ Written procedures and training materials 
should include criteria that system operators 
can use to recognize signs of system stress and 
mitigating measures to be taken before condi¬ 
tions degrade into emergencies. 

♦ Line loading relief procedures should not be 
relied upon when the system is in an insecure 
state, as these procedures cannot be implemented
effectively within the required time
frames in many cases. Other readjustments
must be used, and the system operator must 
take responsibility to restore the system 
immediately. 

♦ Operators’ authority and responsibility to take 
immediate action if they sense the system is 
starting to degrade should be emphasized and 
protected. 

♦ The current processes for assessing the poten¬ 
tial for voltage instability and the need to 
enhance the existing operator training pro¬ 
grams, operational tools, and annual technical 
assessments should be reviewed to improve the 
ability to predict future voltage stability prob¬ 
lems prior to their occurrence, and to mitigate 
the potential for adverse effects on a regional 
scale. 

Comparisons With the 
August 14 Blackout 

The blackout on August 14, 2003, had several 
causes or contributory factors in common with the 
earlier outages, including: 

♦ Inadequate vegetation management 

♦ Failure to ensure operation within secure limits 

♦ Failure to identify emergency conditions and 
communicate that status to neighboring 
systems 

♦ Inadequate operator training 

♦ Inadequate regional-scale visibility over the 
power system 

♦ Inadequate coordination of relays and other 
protective devices or systems. 

New causal features of the August 14 blackout 
include: inadequate interregional visibility over 
the power system; dysfunction of a control area’s 
SCADA/EMS system; and lack of adequate backup 
capability to that system. 


8. Performance of Nuclear Power Plants
Affected by the Blackout 


Introduction 

On August 14, 2003, nine U.S. nuclear power 
plants experienced rapid shutdowns (reactor 
trips) as a consequence of the power outage. Seven 
nuclear power plants in Canada operating at high 
power levels at the time of the event also experi¬ 
enced rapid shutdowns. Four other Canadian 
nuclear plants automatically disconnected from 
the grid due to the electrical transient but were 
able to continue operating at a reduced power 
level and were available to supply power to the 
grid as it was restored by the transmission system 
operators. Six nuclear plants in the United States 
and one in Canada experienced significant electri¬ 
cal disturbances but were able to continue gener¬ 
ating electricity. Many non-nuclear generating 
plants in both countries also tripped during the 
event. Numerous other nuclear plants observed 
disturbances on the electrical grid but continued 
to generate electrical power without interruption. 

The Nuclear Working Group (NWG) was one of 
three Working Groups created to support the 
U.S.-Canada Power System Outage Task Force. 
The NWG was charged with identifying all rele¬ 
vant actions by nuclear generating facilities in 
connection with the outage. Nils Diaz, Chairman
of the U.S. Nuclear Regulatory Commission (NRC),
and Linda Keen, President and CEO of the Canadian
Nuclear Safety Commission (CNSC), were
co-chairs of the Working Group, with other members
appointed from industry and various State
and federal agencies.

In Phase I, the NWG focused on collecting and 
analyzing data from each affected nuclear power 
plant to determine what happened, and whether 
any activities at the plants caused or contributed 
to the power outage or involved a significant 
safety issue. Phase I culminated in the issuance of 
the Task Force’s Interim Report, which reported 
that: 

♦ The affected nuclear power plants did not 
trigger the power outage or inappropriately
contribute to its spread (i.e., to an extent beyond
the normal tripping of the plants at expected 
conditions). 

♦ The severity of the grid transient caused genera¬ 
tors, turbines, or reactor systems at the nuclear 
plants to reach protective feature limits and 
actuate automatic protective actions. 

♦ The nuclear plants responded to the grid condi¬ 
tions in a manner consistent with the plant 
designs. 

♦ The nuclear plants were maintained in a safe 
condition until conditions were met to permit 
the nuclear plants to resume supplying electri¬ 
cal power to the grid. 

♦ For nuclear plants in the United States: 

>■ Fermi 2, Oyster Creek, and Perry tripped due 
to main generator trips, which resulted from 
voltage and frequency fluctuations on the 
grid. Nine Mile 1 tripped due to a main tur¬ 
bine trip due to frequency fluctuations on the 
grid. 

>■ FitzPatrick and Nine Mile 2 tripped due to 
reactor trips, which resulted from turbine 
control system low pressure due to frequency 
fluctuations on the grid. Ginna tripped due to 
a reactor trip which resulted from a large loss 
of electrical load due to frequency fluctua¬ 
tions on the grid. Indian Point 2 and Indian 
Point 3 tripped due to a reactor trip on low 
flow, which resulted when low grid fre¬ 
quency tripped reactor coolant pumps. 

♦ For nuclear plants in Canada: 

>■ At Bruce B and Pickering B, frequency and/or 
voltage fluctuations on the grid resulted in 
the automatic disconnection of generators 
from the grid. For those units that were suc¬ 
cessful in maintaining the unit generators 
operational, reactor power was automatically 
reduced. 

>■ At Darlington, load swing on the grid led to
the automatic reduction in power of the four 
reactors. The generators were, in turn, auto¬ 
matically disconnected from the grid. 

>■ Three reactors at Bruce B and one at Darling¬ 
ton were returned to 60% power. These reac¬ 
tors were available to deliver power to the 
grid on the instructions of the transmission 
system operator. 

>■ Three units at Darlington were placed in a
zero-power hot state, and four units at 
Pickering B and one unit at Bruce B were 
placed in a Guaranteed Shutdown State. 

The licensees’ return to power operation followed 
a deliberate process controlled by plant proce¬ 
dures and regulations. Equipment and process 
problems, whether existing prior to or caused by 
the event, would normally be addressed prior to 
restart. The NWG is satisfied that licensees took an 
appropriately conservative approach to their 
restart activities, placing a priority on safety. 

♦ For U.S. nuclear plants: Ginna, Indian Point 2, 
Nine Mile 2, and Oyster Creek resumed electri¬ 
cal generation on August 17. FitzPatrick and 
Nine Mile 1 resumed electrical generation on 
August 18. Fermi 2 resumed electrical genera¬ 
tion on August 20. Perry resumed electrical gen¬ 
eration on August 21. Indian Point 3 resumed 
electrical generation on August 22. Indian Point 
3 had equipment issues (failed splices in the 
control rod drive mechanism power system) 
that required repair prior to restart. Ginna 
submitted a special request for enforcement dis¬ 
cretion from the NRC to permit mode changes 
and restart with an inoperable auxiliary 
feedwater pump. The NRC granted the request 
for enforcement discretion. 

♦ For Canadian nuclear plants: The restart of the 
Canadian nuclear plants was carried out in 
accordance with approved Operating Policies 
and Principles. Three units at Bruce B and one 
at Darlington were resynchronized with the grid 
within 6 hours of the event. The remaining 
three units at Darlington were reconnected by 
August 17 and 18. Units 5, 6, and 8 at Pickering 
B and Unit 6 at Bruce B returned to service 
between August 22 and August 25. 

The NWG has found no evidence that the shut¬ 
down of the nuclear power plants triggered the 
outage or inappropriately contributed to its spread 
(i.e., to an extent beyond the normal tripping of 
the plants at expected conditions). All the nuclear
plants that shut down or disconnected from the
grid responded automatically to grid conditions. 
All the nuclear plants responded in a manner con¬ 
sistent with the plant designs. Safety functions 
were effectively accomplished, and the nuclear 
plants that tripped were maintained in a safe shut¬ 
down condition until their restart. 

In Phase II, the NWG collected comments and ana¬ 
lyzed information related to potential recommen¬ 
dations to help prevent future power outages. 
Representatives of the NWG, including represen¬ 
tatives of the NRC and the CNSC, attended public 
meetings to solicit feedback and recommenda¬ 
tions held in Cleveland, Ohio; New York City, 
New York; and Toronto, Ontario, on December 4, 
5, and 8, 2003, respectively. Representatives of the 
NWG also participated in the NRC’s public meet¬ 
ing to solicit feedback and recommendations on 
the Northeast blackout held in Rockville, Mary¬ 
land, on January 6, 2004. 

Additional details on both the Phase I and Phase II 
efforts are available in the following sections. Due 
to the major design differences between nuclear 
plants in Canada and the United States, the NWG 
decided to have separate sections for each coun¬ 
try. This also responds to the request by the 
nuclear regulatory agencies in both countries to 
have sections of the report that stand alone, so that 
they can also be used as regulatory documents. 

Findings of the U.S. Nuclear 
Working Group 

Summary 

The U.S. NWG found no evidence that the shut¬ 
down of the nine U.S. nuclear power plants trig¬ 
gered the outage, or inappropriately contributed to 
its spread (i.e., to an extent beyond the normal 
tripping of the plants at expected conditions). All 
nine plants that experienced a reactor trip were 
responding to grid conditions. The severity of the 
grid transient caused generators, turbines, or reac¬ 
tor systems at the plants to reach a protective fea¬ 
ture limit and actuate a plant shutdown. All nine 
plants tripped in response to those conditions in a 
manner consistent with the plant designs. The 
nine plants automatically shut down in a safe 
fashion to protect the plants from the grid tran¬ 
sient. Safety functions were effectively accom¬ 
plished with few problems, and the plants were 
maintained in a safe shutdown condition until 
their restart. 

The nuclear power plant outages that resulted
from the August 14, 2003, power outage were trig¬ 
gered by automatic protection systems for the 
reactors or turbine-generators, not by any manual 
operator actions. The NWG has received no infor¬ 
mation that points to operators deliberately shut¬ 
ting down nuclear units to isolate themselves from 
instabilities on the grid. In short, only automatic 
separation of nuclear units occurred. 

Regarding the 95 other licensed commercial 
nuclear power plants in the United States: 4 were 
already shut down at the time of the power outage, 
one of which experienced a grid disturbance; 70 
operating plants observed some level of grid dis¬ 
turbance but accommodated the disturbances and 
remained on line, supplying power to the grid; and 
21 operating plants did not experience any grid 
disturbance. 

Introduction 

The NRC, which regulates U.S. commercial 
nuclear power plants, has regulatory requirements 
for offsite power systems. These requirements 
address the number of offsite power sources and 
the ability to withstand certain transients. Offsite 
power is the normal source of alternating current 
(AC) power to the safety systems in the plants 
when the plant main generator is not in operation. 
The requirements also are designed to protect 
safety systems from potentially damaging varia¬ 
tions (in voltage and frequency) in the supplied 
power. For loss of offsite power events, the NRC 
requires emergency generation (typically emer¬ 
gency diesel generators) to provide AC power to 
safety systems. In addition, the NRC provides 
oversight of the safety aspects of offsite power 
issues through its inspection program, by moni¬ 
toring operating experience, and by performing 
technical studies. 

Phase I: Fact Finding 

Phase I of the NWG effort focused on collecting 
and analyzing data from each plant to determine 
what happened, and whether any activities at the 
plants caused or contributed to the power outage 
or its spread or involved a significant safety issue. 
To ensure accuracy, comprehensive coordination 
was maintained among the working group mem¬ 
bers and among the NWG, ESWG, and SWG. 

The staff developed a set of technical questions to
obtain data from the owners or licensees of the
nuclear power plants that would enable the staff to
review the response of the nuclear plant systems
in detail. Two additional requests for more specific
information were made for certain plants.
Information from U.S. nuclear power plants was
gathered through the NRC regional offices, which
had NRC resident inspectors at each plant obtain
licensee information to answer the questions.
General design information
was gathered from plant-specific Updated Final 
Safety Analysis Reports and other documents. 

Plant data were compared against plant designs by 
the NRC staff to determine whether the plant 
responses were as expected; whether they 
appeared to cause the power outage or contributed 
to the spread of the outage; and whether applica¬ 
ble safety requirements were met. In some cases 
supplemental questions were developed, and 
answers were obtained from the licensees to clar¬ 
ify the observed response of the plant. The NWG 
interfaced with the ESWG to validate some data 
and to obtain grid information, which contributed 
to the analysis. The NWG identified relevant 
actions by nuclear generating facilities in connec¬ 
tion with the power outage. 

Typical Design, Operational, and 
Protective Features of U.S. Nuclear 
Power Plants 

Nuclear power plants have a number of design, 
operational, and protective features to ensure that 
the plants operate safely and reliably. This section 
describes these features so as to provide a better 
understanding of how nuclear power plants inter¬ 
act with the grid and, specifically, how nuclear 
power plants respond to changing grid conditions. 
While the features described in this section are 
typical, there are differences in the design and 
operation of individual plants which are not 
discussed. 

Design Features of U.S. Nuclear Power Plants 

Nuclear power plants use heat from nuclear reac¬ 
tions to generate steam and use a single steam- 
driven turbine-generator (also known as the main 
generator) to produce electricity supplied to the 
grid. 

Connection of the plant switchyard to the grid.
The plant switchyard normally forms the interface
between the plant main generator and the electri¬ 
cal grid. The plant switchyard has multiple trans¬ 
mission lines connected to the grid system to meet 
offsite power supply requirements for having reli¬ 
able offsite power for the nuclear station under 
all operating and shutdown conditions. Each 
transmission line connected to the switchyard has
dedicated circuit breakers, with fault sensors, to 
isolate faulted conditions in the switchyard or the 
connected transmission lines, such as phase-to- 
phase or phase-to-ground short circuits. The fault 
sensors are fed into a protection scheme for the 
plant switchyard that is engineered to localize 
any faulted conditions with minimum system 
disturbance. 

Connection of the main generator to the switch¬ 
yard. The plant main generator produces electri¬ 
cal power and transmits that power to the offsite 
transmission system. Most plants also supply 
power to the plant auxiliary buses for normal 
operation of the nuclear generating unit through 
the unit auxiliary transformer. During normal 
plant operation, the main generator typically gen¬ 
erates electrical power at about 22 kV. The voltage 
is increased to match the switchyard voltage by 
the main transformers, and the power flows to the 
high voltage switchyard through two power cir¬ 
cuit breakers. 

Power supplies for the plant auxiliary buses. The
safety-related and nonsafety auxiliary buses are
normally lined up to receive power from the main 
generator auxiliary transformer, although some 
plants leave some of their auxiliary buses powered 
from a startup transformer (that is, from the offsite 
power distribution system). When plant power 
generation is interrupted, the power supply auto¬ 
matically transfers to the offsite power source (the 
startup transformer). If that is not supplying 
acceptable voltage, the circuit breakers to the 
safety-related buses open, and the buses are 
reenergized by the respective fast-starting emer¬ 
gency diesel generators. The nonsafety auxiliary 
buses will remain deenergized until offsite power 
is restored. 
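
The transfer sequence described above can be summarized as a short decision function. The Python sketch below is a simplification under stated assumptions: it treats every bus as normally fed from the unit auxiliary transformer (the text notes that some plants leave some buses on the startup transformer), it reduces "acceptable voltage" to a single true/false input, and the function name `select_bus_source` and its return strings are invented for illustration.

```python
def select_bus_source(main_generator_online: bool,
                      offsite_voltage_ok: bool,
                      safety_related: bool) -> str:
    """Return the assumed power source for one plant auxiliary bus."""
    if main_generator_online:
        # Normal lineup: the bus is fed from the main generator's auxiliary transformer.
        return "unit auxiliary transformer"
    if offsite_voltage_ok:
        # Generation interrupted: automatic transfer to the offsite source.
        return "startup transformer (offsite power)"
    if safety_related:
        # Offsite voltage unacceptable: bus breakers open and the diesel reenergizes the bus.
        return "emergency diesel generator"
    # Nonsafety buses wait for offsite power to return.
    return "deenergized until offsite power is restored"
```

The ordering of the checks mirrors the text: loss of generation is handled first, then loss of acceptable offsite voltage, with only the safety-related buses picked up by the emergency diesel generators.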

Operational Features of U.S. Nuclear Power 
Plants 

Response of nuclear power plants to changes in 
switchyard voltage. With the main generator volt¬ 
age regulator in the automatic mode, the generator 
will respond to an increase of switchyard voltage 
by reducing the generator field excitation current. 
This will result in a decrease of reactive power,
normally measured in megavolt-amperes reactive
(MVAr), from the generator to the switchyard
and out to the surrounding grid, helping to control 
the grid voltage increase. With the main generator 
voltage regulator in the automatic mode, the gen¬ 
erator will respond to a decrease of switchyard 
voltage by increasing the generator field excitation 
current. This will result in an increase of reactive
power (MVAr) from the generator to the
switchyard and out to the surrounding grid, help¬ 
ing to control the grid voltage decrease. If the 
switchyard voltage goes low enough, the 
increased generator field current could result in 
generator field overheating. Over-excitation pro¬ 
tective circuitry is generally employed to prevent 
this from occurring. This protective circuitry may 
trip the generator to prevent equipment damage. 
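
The regulator behavior described above can be sketched as a simple proportional response. The following Python example is illustrative only and assumes a per-unit model with invented values: the gain of 5, the field limit of 1.3 per unit, and the function name `regulator_response` are not taken from any plant design. Actual over-excitation protection typically involves time delays and limiter action that this sketch omits.

```python
def regulator_response(switchyard_voltage_pu: float,
                       setpoint_pu: float = 1.0,
                       nominal_field_pu: float = 1.0,
                       gain: float = 5.0,
                       field_limit_pu: float = 1.3):
    """Return (field_current_pu, over_excited) for a given switchyard voltage.

    All quantities are per-unit and all default values are hypothetical.
    """
    # Positive error means the switchyard voltage is below the setpoint.
    error = setpoint_pu - switchyard_voltage_pu
    # Low voltage raises field excitation (more MVAr out); high voltage lowers it.
    field_current_pu = nominal_field_pu + gain * error
    # Flag the condition that over-excitation protection is meant to prevent.
    over_excited = field_current_pu > field_limit_pu
    return field_current_pu, over_excited
```

Under these assumed settings, a switchyard voltage of 1.02 per unit lowers the field current, 0.98 per unit raises it, and 0.90 per unit pushes it past the assumed limit, the situation in which protective circuitry may trip the generator.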

Under-voltage protection is provided for the 
nuclear power plant safety buses, and may be pro¬ 
vided on nonsafety buses and at individual pieces 
of equipment. It is also used in some pressurized 
water reactor designs on reactor coolant pumps 
(RCPs) as an anticipatory loss of RCP flow signal. 

Protective Features of U.S. Nuclear Power 
Plants 

The main generator and main turbine have protec¬ 
tive features, similar to fossil generating stations, 
which protect against equipment damage. In gen¬ 
eral, the reactor protective features are designed to 
protect the reactor fuel from damage and to protect 
the reactor coolant system from over-pressure or 
over-temperature transients. Some trip features 
also produce a corresponding trip in other compo¬ 
nents; for example, a turbine trip typically results 
in a reactor trip above a low power setpoint. 

Generator protective features typically include 
over-current, ground detection, differential relays 
(which monitor for electrical fault conditions 
within a zone of protection defined by the location 
of the sensors, typically the main generator and all 
transformers connected directly to the generator 
output), electrical faults on the transformers con¬ 
nected to the generator, loss of the generator field, 
and a turbine trip. Turbine protective features typ¬ 
ically include over-speed (usually set at 1980 rpm 
or 66 Hz), low bearing oil pressure, high bearing 
vibration, degraded condenser vacuum, thrust 
bearing failure, or generator trip. Reactor protec¬ 
tive features typically include trips for over¬ 
power, abnormal pressure in the reactor coolant 
system, low reactor coolant system flow, low level 
in the steam generators or the reactor vessel, or a 
trip of the turbine. 
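
The two over-speed figures quoted above, 1980 rpm and 66 Hz, describe the same setpoint if the turbine-generator is a four-pole machine on a 60 Hz system. The pole count is an assumption made here for illustration and is not stated in the text. The short Python check below uses the standard synchronous-speed relation, speed in rpm = 120 × frequency / number of poles.

```python
def synchronous_speed_rpm(frequency_hz: float, poles: int) -> float:
    """Synchronous speed of an AC machine: n = 120 * f / p."""
    return 120.0 * frequency_hz / poles


POLES = 4  # assumed four-pole generator; not stated in the source text

rated_rpm = synchronous_speed_rpm(60.0, POLES)      # 1800.0 rpm at nominal frequency
overspeed_rpm = synchronous_speed_rpm(66.0, POLES)  # 1980.0 rpm at 66 Hz
overspeed_ratio = overspeed_rpm / rated_rpm         # about 1.10, i.e. 110% of rated speed
```

On that assumption, the over-speed setting corresponds to roughly 110 percent of rated speed, whether it is expressed as shaft speed or as electrical frequency.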

Considerations on Returning a U.S. 
Nuclear Power Plant to Power 
Production After Switchyard Voltage 
Is Restored 

The following are examples of the types of activi¬ 
ties that must be completed before returning a 
nuclear power plant to power production following
a loss of switchyard voltage.

♦ Switchyard voltage must be normal and stable 
from an offsite supply. Nuclear power plants are 
not designed for black-start capability (the abil¬ 
ity to start up without external power). 

♦ Plant buses must be energized from the 
switchyard and the emergency diesel genera¬ 
tors restored to standby mode. 

♦ Normal plant equipment, such as reactor cool¬ 
ant pumps and circulating water pumps, must 
be restarted. 

♦ A reactor trip review report must be completed 
and approved by plant management, and the 
cause of the trip must be addressed. 

♦ All plant technical specifications must be satisfied.
Technical specifications are issued to each
nuclear power plant as part of its license by
the NRC. They dictate equipment which must
be operable and process parameters which must 
be met to allow operation of the reactor. Exam¬ 
ples of actions that were required following the 
events of August 14 include refilling the diesel 
fuel oil storage tanks, refilling the condensate 
storage tanks, establishing reactor coolant sys¬ 
tem forced flow, and cooling the suppression 
pool to normal operating limits. Surveillance 
tests must be completed as required by techni¬ 
cal specifications (for example, operability of 
the low-range neutron detectors must be 
demonstrated). 

♦ Systems must be aligned to support the startup. 

♦ Pressures and temperatures for reactor startup 
must be established in the reactor coolant sys¬ 
tem for pressurized water reactors. 

♦ A reactor criticality calculation must be per¬ 
formed to predict the control rod withdrawals 
needed to achieve criticality, where the fission 
chain reaction becomes self-sustaining due to 
the increased neutron flux. Certain neutron¬ 
absorbing fission products increase in concen¬ 
tration following a reactor trip (followed later 
by a decrease or decay). At pressurized water 
reactors, the boron concentration in the primary 
coolant must be adjusted to match the criticality 
calculation. Near the end of the fuel cycle, the 
nuclear power plant may not have enough 
boron adjustment or control rod worth available 
for restart until the neutron absorbers have
decreased significantly (more than 24 hours
after the trip). 

It may require a day or more before a nuclear 
power plant can restart following a normal trip. 
Plant trips are a significant transient on plant 
equipment, and some maintenance may be necessary
before the plant can restart. When a trip is combined
with the infrequent event of a loss of offsite power,
additional recovery actions will be required.
Safety systems, such as emergency diesel genera¬ 
tors and safety-related decay heat removal sys¬ 
tems, must be restored to normal lineups. These 
additional actions would extend the time neces¬ 
sary to restart a nuclear plant from this type of 
event. 

Summary of U.S. Nuclear Power Plant 
Response to and Safety During the 
August 14 Outage 

The NWG’s review did not identify any activity or 
equipment issues at U.S. nuclear power plants 
that caused the transient on August 14, 2003. Nine 
nuclear power plants tripped within about 60 sec¬ 
onds as a result of the grid disturbance. Addi¬ 
tionally, many nuclear power plants experienced 
a transient due to this grid disturbance. 

Nuclear Power Plants That Tripped 

The trips at nine nuclear power plants resulted 
from the plant responses to the grid disturbances. 
Following the initial grid disturbances, voltages in 
the plant switchyard fluctuated and reactive 
power flows fluctuated. As the voltage regulators 
on the main generators attempted to compensate, 
equipment limits were exceeded and protective 
trips resulted. This happened at Fermi 2 and Oys¬ 
ter Creek. Fermi 2 tripped on a generator field pro¬ 
tection trip. Oyster Creek tripped due to a 
generator trip on high ratio of voltage relative to 
the electrical frequency. 
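
A trip on "high ratio of voltage relative to the electrical frequency" is commonly called volts-per-hertz protection. The Python sketch below shows how that ratio can be expressed in per-unit terms. It is a hedged illustration only: the 1.10 per-unit limit and the function names are assumptions, and the report does not give Oyster Creek's actual settings or the measured values at the time of the trip.

```python
def volts_per_hertz_pu(voltage_pu: float, frequency_hz: float,
                       nominal_hz: float = 60.0) -> float:
    """Per-unit volts-per-hertz: per-unit voltage divided by per-unit frequency."""
    return voltage_pu / (frequency_hz / nominal_hz)


def v_per_hz_trip(voltage_pu: float, frequency_hz: float,
                  limit_pu: float = 1.10) -> bool:
    """Return True if the ratio exceeds the limit (1.10 pu is a hypothetical setting)."""
    return volts_per_hertz_pu(voltage_pu, frequency_hz) > limit_pu
```

With these assumed numbers, a voltage of 1.05 per unit at 60 Hz is within the limit, but the same voltage at 57 Hz gives a ratio of about 1.105 and would exceed it. This shows how falling frequency alone can drive the ratio high even when voltage is only modestly elevated.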

Also, as the balance between electrical generation 
and electrical load on the grid was disturbed, the 
electrical frequency began to fluctuate. In some 
cases the electrical frequency dropped low 
enough to actuate protective features. This hap¬ 
pened at Indian Point 2, Indian Point 3, and Perry. 
Perry tripped due to a generator under-frequency 
trip signal. Indian Point 2 and Indian Point 3 trip¬ 
ped when the grid frequency dropped low enough 
to trip reactor coolant pumps, which actuated a 
reactor protective feature. 

In other cases, the electrical frequency fluctuated
and went higher than normal. Turbine control sys¬ 
tems responded in an attempt to control the fre¬ 
quency. Equipment limits were exceeded as a 
result of the reaction of the turbine control sys¬ 
tems to large frequency changes. This led to trips 
at FitzPatrick, Nine Mile 1, Nine Mile 2, and 
Ginna. FitzPatrick and Nine Mile 2 tripped on low 
pressure in the turbine hydraulic control oil sys¬ 
tem. Nine Mile 1 tripped on turbine light load pro¬ 
tection. Ginna tripped due to conditions in the 
reactor following rapid closure of the turbine con¬ 
trol valves in response to high frequency on the 
grid. 

The Perry, Fermi 2, Oyster Creek, and Nine Mile 1
reactors tripped immediately after the generator
tripped, although that is not apparent from the
times below, because the clocks were not synchronized
to the national time standard. The Indian
Point 2 and 3, FitzPatrick, Ginna, and Nine Mile 2
reactors tripped before the generators. When the
reactor trips first, there is generally a short time
delay before the generator output breakers open.
The electrical generation decreases rapidly to zero
after the reactor trip. Table 8.1 lists the reactor
trip times from the data collected and the times the
generator output breakers opened (generator trip),
as reported by the ESWG. Additional details on the
plants that tripped are given below and summarized
in Table 8.2.

Fermi 2. Fermi 2 is located 25 miles (40 km) northeast
of Toledo, Ohio, in southern Michigan on
Lake Erie. It was generating about 1,130
megawatts-electric (MWe) before the event. The reactor
tripped due to a turbine trip. The turbine trip was
likely the result of multiple generator field protection
trips (overexcitation and loss of field) as the
Fermi 2 generator responded to a series of rapidly
changing transients prior to its loss. This is consistent
with data that show large swings of the Fermi
2 generator MVAr prior to its trip.

Offsite power was subsequently lost to the plant 
auxiliary buses. The safety buses were deenergized
and automatically reenergized from the
emergency diesel generators. The operators tripped
one emergency diesel generator that was paralleled
to the grid for testing, after which it
automatically loaded. Decay heat removal systems 
maintained the cooling function for the reactor 
fuel. 

The lowest emergency declaration, an Unusual
Event, was declared at about 16:22 EDT due to the
loss of offsite power. Offsite power was restored to
at least one safety bus at about 01:53 EDT on
August 15. The following equipment problems
were noted: the Combustion Turbine Generator
(the alternate AC power source) failed to start from
the control room; however, it was successfully
started locally. In addition, the Spent Fuel Pool
Cooling System was interrupted for approximately
26 hours and reached a maximum temperature
of 130 degrees Fahrenheit (55 degrees
Celsius). The main generator was reconnected to
the grid at about 01:41 EDT on August 20.

FitzPatrick. FitzPatrick is located about 8 miles 
(13 km) northeast of Oswego, NY, in northern New 
York on Lake Ontario. It was generating about 850 
MWe before the event. The reactor tripped due to 
low pressure in the hydraulic system that controls 
the turbine control valves. Low pressure in this 
system typically indicates a large load reject, for 
which a reactor trip is expected. In this case the 
pressure in the system was low because the control
system was rapidly manipulating the turbine
control valves to control turbine speed, which was 
being affected by grid frequency fluctuations. 

Immediately preceding the trip, both significant 
over-voltage and under-voltage grid conditions 
were experienced. Offsite power was subsequently
lost to the plant auxiliary buses. The
safety buses were deenergized and automatically 
reenergized from the emergency diesel generators. 

The lowest emergency declaration, an Unusual 
Event, was declared at about 16:26 EDT due to the 
loss of offsite power. Decay heat removal systems 
maintained the cooling function for the reactor 
fuel. Offsite power was restored to at least one 
safety bus at about 23:07 EDT on August 14. The 
main generator was reconnected to the grid at 
about 06:10 EDT on August 18. 


Table 8.1. U.S. Nuclear Plant Trip Times

Nuclear Plant     Reactor Trip (a)   Generator Trip (b)
Perry             16:10:25 EDT       16:10:42 EDT
Fermi 2           16:10:53 EDT       16:10:53 EDT
Oyster Creek      16:10:58 EDT       16:10:57 EDT
Nine Mile 1       16:11 EDT          16:11:04 EDT
Indian Point 2    16:11 EDT          16:11:09 EDT
Indian Point 3    16:11 EDT          16:11:23 EDT
FitzPatrick       16:11:04 EDT       16:11:32 EDT
Ginna             16:11:36 EDT       16:12:17 EDT
Nine Mile 2       16:11:48 EDT       16:11:52 EDT

(a) As determined from licensee data (which may not be
synchronized to the national time standard).

(b) As reported by the Electrical System Working Group
(synchronized to the national time standard).
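
As a rough illustration of the clock-synchronization caveat attached
to Table 8.1, the sketch below subtracts each plant's reactor trip
time from its generator trip time. It is not part of the Task Force
analysis; the names TRIP_TIMES, lag_seconds, and LAGS are illustrative
assumptions, and the plants whose reactor trip time is given only to
the minute are left out.

```python
from datetime import datetime

# Times transcribed from Table 8.1 (EDT, August 14, 2003).
# Each entry is (reactor trip, generator trip).
# Plants with a reactor trip time given only to the minute are omitted.
TRIP_TIMES = {
    "Perry":        ("16:10:25", "16:10:42"),
    "Fermi 2":      ("16:10:53", "16:10:53"),
    "Oyster Creek": ("16:10:58", "16:10:57"),
    "FitzPatrick":  ("16:11:04", "16:11:32"),
    "Ginna":        ("16:11:36", "16:12:17"),
    "Nine Mile 2":  ("16:11:48", "16:11:52"),
}


def lag_seconds(reactor_trip: str, generator_trip: str) -> int:
    """Return generator trip time minus reactor trip time, in seconds."""
    fmt = "%H:%M:%S"
    delta = (datetime.strptime(generator_trip, fmt)
             - datetime.strptime(reactor_trip, fmt))
    return int(delta.total_seconds())


# Positive lag: the reactor trip is recorded before the generator trip.
LAGS = {plant: lag_seconds(*times) for plant, times in TRIP_TIMES.items()}

for plant, lag in LAGS.items():
    print(f"{plant:<13} generator trip minus reactor trip: {lag:+d} s")
```

With the table values, Perry shows a lag of +17 seconds even though its
reactor is described as tripping after the generator. This is the kind
of apparent inconsistency that the unsynchronized licensee clocks
noted in footnote (a) can produce.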

Ginna. Ginna is located 20 miles (32 km) northeast
of Rochester, NY, in northern New York on
Lake Ontario. It was generating about 487 MWe
before the event. The reactor tripped due to
Over-Temperature-Delta-Temperature. This trip signal
protects the reactor core from exceeding temperature
limits. The turbine control valves closed
down in response to the changing grid conditions.
This caused a temperature and pressure transient
in the reactor, resulting in an
Over-Temperature-Delta-Temperature trip.

Offsite power was not lost to the plant auxiliary 
buses. In the operators’ judgement, offsite power 
was not stable, so they conservatively energized 
the safety buses from the emergency diesel generators.
Decay heat removal systems maintained the
cooling function for the reactor fuel. Offsite power 
was not lost, and stabilized about 50 minutes after 
the reactor trip. 

The lowest emergency declaration, an Unusual 
Event, was declared at about 16:46 EDT due to the 
degraded offsite power. Offsite power was 
restored to at least one safety bus at about 21:08 
EDT on August 14. The following equipment 
problems were noted: the digital feedwater control
system behaved in an unexpected manner following
the trip, resulting in high steam generator levels;
there was a loss of reactor coolant pump (RCP) seal
flow indication, which complicated restarting the
pumps; and at least one of the power-operated relief
valves experienced minor leakage following proper
operation and closure during the transient. Also, one
of the motor-driven auxiliary feedwater pumps was
damaged after running with low flow conditions
due to an improper valve alignment. The redundant
pumps supplied the required water flow.

The NRC issued a Notice of Enforcement Discretion
to allow Ginna to perform mode changes and
restart the reactor with one auxiliary feedwater
(AFW) pump inoperable. Ginna has two motor-driven
AFW pumps, one turbine-driven AFW pump, and two
standby AFW pumps, all powered from safety-related
buses. The main generator was reconnected
to the grid at about 20:38 EDT on August
17.

Indian Point 2. Indian Point 2 is located 24 miles 
(39 km) north of New York City on the Hudson 
River. It was generating about 990 MWe before the 
event. The reactor tripped due to loss of a reactor 
coolant pump that tripped because the auxiliary 
bus frequency fluctuations actuated the under-frequency
relay, which protects against inadequate
coolant flow through the reactor core. This
reactor protection signal tripped the reactor,
which resulted in turbine and generator trips.

The auxiliary bus experienced the under-frequency
due to fluctuating grid conditions.
Offsite power was lost to all the plant auxiliary 
buses. The safety buses were reenergized from the 
emergency diesel generators. Decay heat removal 
systems maintained the cooling function for the 
reactor fuel. 

The lowest emergency declaration, an Unusual 
Event, was declared at about 16:25 EDT due to the 
loss of offsite power for more than 15 minutes. 
Offsite power was restored to at least one safety 
bus at about 20:02 EDT on August 14. The following
equipment problems were noted: the service
water to one of the emergency diesel generators
developed a leak; a steam generator atmospheric
dump valve did not control steam generator pressure
in automatic and had to be shifted to manual;
a steam trap associated with the turbine-driven
AFW pump failed open, resulting in operators
securing the turbine after 2.5 hours; loss of instrument
air required operators to take manual control
of charging, and a letdown isolation occurred;
operators in the field could not use radios; and the
diesel generator for the Unit 2 Technical Support
Center failed to function. Also, several uninterruptible
power supplies in the Emergency Operations
Facility failed. This reduced the capability
for communications and data collection. Alternate
equipment was used to maintain vital communications.¹
The main generator was reconnected to
the grid at about 12:58 EDT on August 17.

Indian Point 3. Indian Point 3 is located 24 miles 
(39 km) north of New York City on the Hudson 
River. It was generating about 1,010 MWe before 
the event. The reactor tripped due to loss of a reactor
coolant pump that tripped because the auxiliary
bus frequency fluctuations actuated the
under-frequency relay, which protects against 
inadequate coolant flow through the reactor core. 
This reactor protection signal tripped the reactor, 
which resulted in turbine and generator trips. 

The auxiliary bus experienced the under-frequency
due to fluctuating grid conditions.
Offsite power was lost to all the plant auxiliary 
buses. The safety buses were reenergized from the 
emergency diesel generators. Decay heat removal 
systems maintained the cooling function for the 
reactor fuel. 

The lowest emergency declaration, an Unusual 
Event, was declared at about 16:23 EDT due to the 
loss of offsite power for more than 15 minutes.
Offsite power was restored to at least one safety
bus at about 20:12 EDT on August 14. The following
equipment problems were noted: a steam generator
safety valve lifted below its desired setpoint
and was gagged; loss of instrument air, including
failure of the diesel backup compressor to start
and failure of the backup nitrogen system,
resulted in manual control of atmospheric dump
valves and AFW pumps needing to be secured to
prevent overfeeding the steam generators; a blown
fuse in a battery charger resulted in a longer battery
discharge; a control rod drive mechanism
cable splice failed; and there were high resistance
readings on 345-kV breaker-1. These equipment
problems required correction prior to startup,
which delayed the startup. The diesel generator
for the Unit 3 Technical Support Center failed to
function. Also, several uninterruptible power supplies
in the Emergency Operations Facility failed.
This reduced the capability for communications
and data collection. Alternate equipment was
used to maintain vital communications.² The
main generator was reconnected to the grid at
about 05:03 EDT on August 22.

Nine Mile 1. Nine Mile 1 is located 6 miles (10 km)
northeast of Oswego, NY, in northern New York
on Lake Ontario. It was generating about 600 MWe
before the event. The reactor tripped in response
to a turbine trip. The turbine tripped on light load
protection (which protects the turbine against a
loss of electrical load), when responding to fluctuating
grid conditions. The turbine trip caused fast
closure of the turbine valves, which, through
acceleration relays on the control valves, creates a
signal to trip the reactor. After a time delay of 10
seconds, the generator tripped on reverse power.

The safety buses were automatically deenergized 
due to low voltage and automatically reenergized 
from the emergency diesel generators. Decay heat 
removal systems maintained the cooling function 
for the reactor fuel. 

The lowest emergency declaration, an Unusual 
Event, was declared at about 16:33 EDT due to the 
degraded offsite power. Offsite power was 
restored to at least one safety bus at about 23:39 
EDT on August 14. The following additional 
equipment problems were noted: a feedwater 
block valve failed “as is” on the loss of voltage, 
resulting in a high reactor vessel level; fuses blew 
in fire circuits, causing control room ventilation 
isolation and fire panel alarms; and operators were
delayed in placing shutdown cooling in service for
several hours due to lack of procedure guidance to
address particular plant conditions encountered
during the shutdown. The main generator was
reconnected to the grid at about 02:08 EDT on
August 18.

Nine Mile 2. Nine Mile 2 is located 6 miles (10 km) 
northeast of Oswego, NY, in northern New York 
on Lake Ontario. It was generating about 1,193 
MWe before the event. The reactor scrammed due 
to the actuation of pressure switches which 
detected low pressure in the hydraulic system that 
controls the turbine control valves. Low pressure 
in this system typically indicates a large load 
reject, for which a reactor trip is expected. In this 
case the pressure in the system was low because 
the control system was rapidly manipulating the 
turbine control valves to control turbine speed, 
which was being affected by grid frequency 
fluctuations. 

After the reactor tripped, several reactor level control
valves did not reposition, and with the main
feedwater system continuing to operate, a high
water level in the reactor caused a turbine trip,
which caused a generator trip. Offsite power was
degraded but available to the plant auxiliary
buses. The offsite power dropped below the normal
voltage levels, which resulted in the safety
buses being automatically energized from the 
emergency diesel generators. Decay heat removal 
systems maintained the cooling function for the 
reactor fuel. 

The lowest emergency declaration, an Unusual 
Event, was declared at about 17:00 EDT due to the 
loss of offsite power to the safety buses for more 
than 15 minutes. Offsite power was restored to at 
least one safety bus at about 01:33 EDT on August 
15. The following additional equipment problem 
was noted: a tap changer on one of the offsite 
power transformers failed, complicating the restoration
of one division of offsite power. The main
generator was reconnected to the grid at about 
19:34 EDT on August 17. 

Oyster Creek. Oyster Creek is located 9 miles (14 
km) south of Toms River, NJ, near the Atlantic 
Ocean. It was generating about 629 MWe before 
the event. The reactor tripped due to a turbine trip. 
The turbine trip was the result of a generator trip 
due to actuation of a high Volts/Hz protective trip. 
The Volts/Hz trip is a generator/transformer protective
feature. The plant safety and auxiliary
buses transferred from the main generator supply
to the offsite power supply following the plant
trip. Other than the plant transient, no equipment
or performance problems were determined to be
directly related to the grid problems. 
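
To make the Volts/Hz quantity concrete, the sketch below computes the
per-unit ratio of voltage to frequency and compares it with a pickup
setting. It is a simplified illustration under stated assumptions, not
a description of the Oyster Creek protection: the function names and
the 1.10 per-unit pickup value are hypothetical, and the time delay
that an actual relay would apply is not modeled.

```python
def volts_per_hertz_pu(voltage_pu: float, frequency_hz: float,
                       rated_frequency_hz: float = 60.0) -> float:
    """Per-unit Volts/Hz: per-unit voltage divided by per-unit frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return voltage_pu / (frequency_hz / rated_frequency_hz)


# Hypothetical pickup setting, for illustration only (not a plant value).
ASSUMED_PICKUP_PU = 1.10


def would_pick_up(voltage_pu: float, frequency_hz: float,
                  pickup_pu: float = ASSUMED_PICKUP_PU) -> bool:
    """True if the per-unit Volts/Hz ratio exceeds the assumed pickup."""
    return volts_per_hertz_pu(voltage_pu, frequency_hz) > pickup_pu


# Assumed example: voltage 5% above rated while frequency sags to 57 Hz.
RATIO = volts_per_hertz_pu(1.05, 57.0)
print(f"V/Hz = {RATIO:.3f} pu, picks up: {would_pick_up(1.05, 57.0)}")
```

The example shows why the ratio matters during a disturbance: a
moderately high voltage combined with a low frequency raises the ratio
above what either deviation alone would suggest.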

Post-trip, the operators did not get the mode switch
to shutdown before main steam header pressure
reached its isolation setpoint. The resulting main
steam isolation valve (MSIV) closure complicated
the operators’ response because the normal steam
path to the main condenser was lost. The operators
used the isolation
condensers for decay heat removal. The plant 
safety and auxiliary buses remained energized 
from offsite power for the duration of the event, 
and the emergency diesel generators were not 
started. Decay heat removal systems maintained 
the cooling function for the reactor fuel. The main 
generator was reconnected to the grid at about 
05:02 EDT on August 17. 

Perry. Perry is located 7 miles (11 km) northeast of 
Painesville, OH, in northern Ohio on Lake Erie. It 
was generating about 1,275 MWe before the event. 
The reactor tripped due to a turbine control valve 
fast closure trip signal. The turbine control valve 
fast closure trip signal was due to a generator 
under-frequency trip signal that tripped the generator
and the turbine and was triggered by grid frequency
fluctuations. Plant operators noted voltage
fluctuations and spikes on the main transformer,
and the Generator Out-of-Step Supervisory relay
actuated approximately 30 minutes before the
trip. This supervisory relay senses a ground fault
on the grid. The purpose is to prevent a remote
fault on the grid from causing a generator out-of-step
relay to activate, which would result in a generator
trip. Approximately 30 seconds prior to the
trip operators noted a number of spikes on the generator
field volt meter, which subsequently went
offscale high. The MVAr and MW meters likewise 
went offscale high. 

The safety buses were deenergized and automatically
reenergized from the emergency diesel generators.
Decay heat removal systems maintained
the cooling function for the reactor fuel. The following
equipment problems were noted: a steam
bypass valve opened; a reactor water clean-up system
pump tripped; the off-gas system isolated; and
a keep-fill pump was found to be air-bound,
requiring venting and filling before the residual
heat removal system loop A and the low pressure
core spray system could be restored to service.

The lowest emergency declaration, an Unusual 
Event, was declared at about 16:20 EDT due to the 
loss of offsite power. Offsite power was restored to 
at least one safety bus at about 18:13 EDT on 
August 14. The main generator was reconnected
to the grid at about 23:15 EDT on August 21. After
the plant restarted, a surveillance test indicated a
problem with one emergency diesel generator.³

Nuclear Power Plants With a Significant 
Transient 

The electrical disturbance on August 14 had a significant
impact on seven plants that continued to
remain connected to the grid. For this review, significant
impact means that these plants had significant
load adjustments that resulted in bypassing
steam from the turbine generator, opening of relief 
valves, or requiring the onsite emergency diesel 
generators to automatically start due to low 
voltage. 

Nuclear Power Plants With a Non-Significant 
Transient 

Sixty-four nuclear power plants experienced
non-significant transients caused by minor disturbances
on the electrical grid. These plants were
able to respond to the disturbances through normal
control systems. Examples of these transients
included changes in load of a few megawatts or
changes in frequency of a few tenths of a hertz.

Nuclear Power Plants With No Transient 

Twenty-four nuclear power plants experienced no 
transient and saw essentially no disturbances on 
the grid, or were shut down at the time of the 
transient. 

General Observations Based on the Facts 
Found During Phase One 

The NWG found no evidence that the shutdown of 
U.S. nuclear power plants triggered the outage or 
inappropriately contributed to its spread (i.e., to 
an extent beyond the normal tripping of the plants 
at expected conditions). This review did not identify
any activity or equipment issues that appeared
to start the transient on August 14, 2003. All nine 
plants that experienced a reactor trip were 
responding to grid conditions. The severity of the 
transient caused generators, turbines, or reactor 
systems to reach a protective feature limit and 
actuate a plant shutdown. 

All nine plants tripped in response to those conditions
in a manner consistent with the plant
designs. All nine plants safely shut down. All
safety functions were effectively accomplished,
with few problems, and the plants were maintained
in a safe shutdown condition until their
restart. Fermi 2, Nine Mile 1, Oyster Creek, and
Perry tripped on turbine and generator protective
features. FitzPatrick, Ginna, Indian Point 2 and 3,
and Nine Mile 2 tripped on reactor protective 
features. 

Nine plants used their emergency diesel generators
to power their safety-related buses during the
power outage. Offsite power was restored to the
safety buses after the grid was energized and the
plant operators, in consultation with the transmission
system operators, decided the grid was stable.
Although the Oyster Creek plant tripped, offsite
power was never lost to its safety buses and the
emergency diesel generators did not start and
were not required. Another plant, Davis-Besse, 
was already shut down but lost power to the safety 
buses. The emergency diesel generators started 
and provided power to the safety buses as 
designed. 

For the eight remaining tripped plants and 
Davis-Besse (which was already shut down prior 
to the events of August 14), offsite power was 
restored to at least one safety bus after a period of 
time ranging from about 2 hours to about 14 hours, 
with an average time of about 7 hours. Although 
Ginna did not lose offsite power, the operators 
judged offsite power to be unstable and realigned 
the safety buses to the emergency diesel 
generators. 
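
The restoration figures can be cross-checked against the per-plant
times given earlier in this chapter. The sketch below is a rough
calculation under stated assumptions, not the NWG's method: it takes
each plant's generator trip time from Table 8.1 as the start of the
interval (the report does not state which reference time was used),
and it covers only the eight tripped plants, because the restoration
time for Davis-Besse is not given in this section. The names EVENTS,
hours_between, DURATIONS, SHORTEST, LONGEST, and MEAN_HOURS are
illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# (assumed start: generator trip from Table 8.1,
#  offsite power restored to at least one safety bus), all EDT.
EVENTS = {
    "Fermi 2":        ("2003-08-14 16:10:53", "2003-08-15 01:53:00"),
    "FitzPatrick":    ("2003-08-14 16:11:32", "2003-08-14 23:07:00"),
    "Ginna":          ("2003-08-14 16:12:17", "2003-08-14 21:08:00"),
    "Indian Point 2": ("2003-08-14 16:11:09", "2003-08-14 20:02:00"),
    "Indian Point 3": ("2003-08-14 16:11:23", "2003-08-14 20:12:00"),
    "Nine Mile 1":    ("2003-08-14 16:11:04", "2003-08-14 23:39:00"),
    "Nine Mile 2":    ("2003-08-14 16:11:52", "2003-08-15 01:33:00"),
    "Perry":          ("2003-08-14 16:10:42", "2003-08-14 18:13:00"),
}


def hours_between(start: str, end: str) -> float:
    """Elapsed time from start to end, in hours."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 3600.0


DURATIONS = {plant: hours_between(*t) for plant, t in EVENTS.items()}
SHORTEST = min(DURATIONS, key=DURATIONS.get)
LONGEST = max(DURATIONS, key=DURATIONS.get)
MEAN_HOURS = sum(DURATIONS.values()) / len(DURATIONS)

for plant, hours in sorted(DURATIONS.items(), key=lambda kv: kv[1]):
    print(f"{plant:<15} {hours:5.1f} h")
print(f"Mean over {len(DURATIONS)} plants: {MEAN_HOURS:.1f} h")
```

Under these assumptions the shortest interval is about 2 hours (Perry)
and the eight-plant mean is about 6 hours. The eight-plant numbers are
not expected to reproduce the 14-hour maximum or the 7-hour average
exactly, since those figures also include Davis-Besse.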

The licensees’ return to power operation follows a 
deliberate process controlled by plant procedures 
and NRC regulations. Ginna, Indian Point 2, Nine
Mile 2, and Oyster Creek resumed electrical generation
on August 17. FitzPatrick and Nine Mile 1
resumed electrical generation on August 18. Fermi
2 resumed electrical generation on August 20.
Perry resumed electrical generation on August 21.
Indian Point 3 resumed electrical generation on
August 22. Indian Point 3 had equipment issues
(failed splices in the control rod drive mechanism
power system) that required repair prior to restart.
Ginna submitted a special request for enforcement
discretion from the NRC to permit mode changes
and restart with an inoperable auxiliary feedwater
pump. The NRC granted the request for enforcement
discretion.

Conclusions of the U.S. Nuclear 
Working Group 

As discussed above, the investigation of the U.S. 
nuclear power plant responses during the 
blackout found no significant deficiencies. 
Accordingly, there are no recommendations here 
concerning U.S. nuclear power plants. Some areas 
for consideration on a grid-wide basis were discussed
and forwarded to the Electric System
Working Group for their review. 

On August 14, 2003, nine U.S. nuclear power 
plants tripped as a result of the loss of offsite 
power. Nuclear power plants are designed to cope 
with the loss of offsite power (LOOP) through the 
use of emergency power supplies (primarily 
on-site diesel generators). The safety function of 
most concern during a LOOP is the removal of 
heat from the reactor core. Although the control 
rods have been inserted to stop the fission process, 
the continuing decay of radioactive isotopes in the 
reactor core produces a significant amount of heat 
for many weeks. If this decay heat is not removed, 
it will cause fuel damage and the release of highly 
radioactive isotopes from the reactor core. The 
failure of the alternating current emergency power 
supplies in conjunction with a LOOP is known 
as a station blackout. Failures of the emergency
power supplies would seriously hinder the ability
of the plant operators to carry out the required
safety functions. Nuclear plants can cope with a
station blackout for a limited time without suffering
fuel damage. However, recovery of the grid or
the restoration of an emergency power supply is
needed for long-term decay heat removal. For this
reason, the NRC considers LOOP events to be
potential precursors to more serious situations.
The risk of reactor core damage increases as the
LOOP frequency or duration increases.

Table 8.2. Summary of Events for U.S. Nuclear Power Plants

                                              Operating Status at Time of Event  Response to Event
Nuclear Plant                           Unit  Full Power        Not Operating    Reactor and Turbine Trip  Emergency Diesels Used
Davis-Besse (near Toledo, OH)           1     -                 ✓                -                         ✓
Fermi (near Toledo, OH)                 2     ✓                 -                ✓                         ✓
James A. FitzPatrick (near Oswego, NY)  1     ✓                 -                ✓                         ✓
Ginna (near Rochester, NY)              1     ✓                 -                ✓                         ✓
Indian Point (near New York City, NY)   2     ✓                 -                ✓                         ✓
                                        3     ✓                 -                ✓                         ✓
Nine Mile Point (near Oswego, NY)       1     ✓                 -                ✓                         ✓
                                        2     ✓                 -                ✓                         ✓
Oyster Creek (near Toms River, NJ)      1     ✓                 -                ✓                         -
Perry (near Painesville, OH)            1     ✓                 -                ✓                         ✓

Offsite power is considered the preferred power 
source for responding to all off-normal events or 
accidents. However, if the grid is operated in a 
stressed configuration, the loss of the nuclear 
plant generation may result in grid voltage dropping
below the level needed for the plant safety
loads. In that case, each plant is designed such
that voltage relays will automatically disconnect
the plant safety-related electrical buses from the
grid and reenergize them from the emergency diesel
generators (EDGs). Although the resultant
safety system responses have been analyzed and
found acceptable, the loss of offsite power reduces
the plant’s safety margin. It also increases the risk
associated with failures of the EDGs. For these reasons,
the NRC periodically assesses the impact of
grid reliability on overall nuclear plant safety. 

The NRC monitors grid reliability under its normal
monitoring programs, such as the operating
experience program, and has previously issued
reports related to grid reliability. The NRC is continuing
with an internal review of the reliability of
the electrical grid and the effect on the risk profile 
for nuclear power plants. The NRC will consider 
the implications of the August 14, 2003, Northeast 
blackout under the NRC’s regulations. The NRC 
is conducting an internal review of its station 
blackout rule, and the results of the August 14th 
event will be factored into that review. If there are 
additional findings, the NRC will address them 
through the NRC’s normal process. 

Findings of the Canadian Nuclear 
Working Group 

Summary 

On the afternoon of August 14, 2003, southern 
Ontario, along with the northeastern United 
States, experienced a widespread electrical power 
system outage. Eleven nuclear power plants in 
Ontario operating at high power levels at the time
of the event either automatically shut down as a
result of the grid disturbance or automatically 
reduced power while waiting for the grid to be 
reestablished. In addition, the Point Lepreau 
Nuclear Generating Station in New Brunswick 
was forced to reduce electricity production for a 
short period. 

The Canadian NWG (CNWG) was mandated to: 
review the sequence of events for each Canadian 
nuclear plant; determine whether any events 
caused or contributed to the power system outage; 
evaluate any potential safety issues arising as a 
result of the event; evaluate the effect on safety 
and the reliability of the grid of design features, 
operating procedures, and regulatory requirements
at Canadian nuclear power plants; and
assess the impact of associated regulator performance
and regulatory decisions.

In Ontario, 11 nuclear units were operating and 
delivering power to the grid at the time of the grid 
disturbance: 4 at Bruce B, 4 at Darlington, and 3 at 
Pickering B. Of the 11 reactors, 7 shut down as a 
result of the event (1 at Bruce B, 3 at Darlington, 
and 3 at Pickering B). Four reactors (3 at Bruce B 
and 1 at Darlington) disconnected safely from the 
grid but were able to avoid shutting down and 
were available to supply power to the Ontario grid 
as soon as reconnection was enabled by Ontario’s 
Independent Market Operator (IMO). 

New Brunswick Power’s Point Lepreau Generating 
Station responded to the loss of grid event by cutting
power to 460 MW, returning to fully stable
conditions at 16:35 EDT, within 25 minutes of the 
event. Hydro Quebec’s (HQ) grid was not affected 
by the power system outage, and HQ’s Gentilly-2 
nuclear station continued to operate normally. 

Having reviewed the operating data for each plant 
and the responses of the power stations and their 
staff to the event, the CNWG concludes the 
following: 

♦ None of the reactor operators had any advance
warning of impending collapse of the grid.

  ➢ Trend data obtained indicate stable conditions
    until a few minutes before the event.

  ➢ There were no prior warnings from Ontario’s
    IMO.

♦ Canadian nuclear power plants did not trigger
the power system outage or contribute to its
spread. Rather they responded, as anticipated,
in order to protect equipment and systems from
the grid disturbances. Plant data confirm the
following.

  ➢ At Bruce B and Pickering B, frequency and/or
    voltage fluctuations on the grid resulted in
    the automatic disconnection of generators
    from the grid. For those units that were
    successful in maintaining the unit generators
    operational, reactor power was automatically
    reduced.

  ➢ At Darlington, load swing on the grid led to
    the automatic reduction in power of the four
    reactors. The generators were, in turn,
    automatically disconnected from the grid.

  ➢ Three reactors at Bruce B and one at Darlington
    were returned to 60% power. These reactors
    were available to deliver power to the
    grid on the instructions of the IMO.

  ➢ Three units at Darlington were placed in a
    zero-power hot state, and four units at
    Pickering B and one unit at Bruce B were
    placed in a guaranteed shutdown state.

♦ There were no risks to health and safety of
workers or the public as a result of the shutdown
of the reactors.

  ➢ Turbine, generator, and reactor automatic
    safety systems worked as designed to
    respond to the loss of grid.

  ➢ Station operating staff and management followed
    approved Operating Policies & Principles
    (OP&Ps) in responding to the loss of grid.
    At all times, operators and shift supervisors
    made appropriately conservative decisions in
    favor of protecting health and safety.

The CNWG commends the staff of Ontario Power 
Generation and Bruce Power for their response to 
the power system outage. At all times, staff acted 
in accordance with established OP&Ps, and took 
an appropriately conservative approach to 
decisions. 

During the course of its review, the CNWG also 
identified the following secondary issues: 

♦ Equipment problems and design limitations at 
Pickering B resulted in a temporary reduction in 
the effectiveness of some of the multiple safety 
barriers, although the equipment failure was 
within the unavailability targets found in the 
OP&Ps approved by the CNSC as part of Ontario 
Power Generation’s licence. 

♦ Existing OP&Ps place constraints on the use of
adjuster rods to respond to events involving
rapid reductions in reactor power. While
greater flexibility with respect to use of adjuster
rods would not have prevented the shutdown,
some units, particularly those at Darlington,
might have been able to return to service less
than 1 hour after the initiating event.

♦ Off-site power was unavailable for varying periods
of time, from approximately 3 hours at
Bruce B to approximately 9 hours at Pickering 
A. Despite the high priority assigned by the IMO 
to restoring power to the nuclear stations, the 
stations had some difficulty in obtaining timely 
information about the status of grid recovery 
and the restoration of Class IV power. This 
information is important for Ontario Power 
Generation’s and Bruce Power’s response 
strategy. 

♦ Required regulatory approvals from CNSC staff 
were obtained quickly and did not delay the 
restart of the units; however, CNSC staff was 
unable to immediately activate the CNSC’s 
Emergency Operation Centre because of loss of 
power to the CNSC’s head office building. 
CNSC staff, therefore, established communications
with licensees and the U.S. NRC from
other locations. 

Introduction 

The primary focus of the CNWG during Phase I 
was to address nuclear power plant response relevant
to the power outage of August 14, 2003. Data
were collected from each power plant and analyzed
in order to determine: the cause of the power
outage; whether any activities at these plants
caused or contributed to the power outage; and
whether there were any significant safety issues.
In order to obtain reliable and comparable information
and data from each nuclear power plant, a
questionnaire was developed to help pinpoint 
how each nuclear power plant responded to the 
August 14 grid transients. Where appropriate, 
additional information was obtained from the 
ESWG and SWG. 

The operating data from each plant were
compared against the plant design specifications to
determine whether the plants responded as
expected. Based on initial plant responses to the
questionnaire, supplemental questions were
developed, as required, to further clarify
outstanding matters. Supplementary information on the
design features of Ontario’s nuclear power plants
was also provided by Ontario Power Generation
and Bruce Power. The CNWG also consulted a
number of subject area specialists, including
CNSC staff, to validate the responses to the
questionnaire and to ensure consistency in their
interpretation.

In addition to the stakeholder consultations
discussed in the Introduction to this chapter, CNSC
staff met with officials from Ontario’s
Independent Market Operator on January 7, 2004.

Typical Design, Operational, and 
Protective Features of CANDU Nuclear 
Power Plants 

There are 22 CANDU nuclear power reactors in 
Canada, 20 of them located in Ontario at 5 multi-unit
stations (Pickering A and Pickering B located in
Pickering, Darlington located in the Municipality 
of Clarington, and Bruce A and Bruce B located 
near Kincardine). There are also single-unit 
CANDU stations at Becancour, Quebec (Gentilly- 
2), and Point Lepreau, New Brunswick. 

In contrast to the pressurized water reactors used 
in the United States, which use enriched uranium 
fuel and a light water coolant-moderator, all 
housed in a single, large pressure vessel, a CANDU 
reactor uses fuel fabricated from natural uranium, 
with heavy water as the coolant and moderator. 
The fuel and pressurized heavy water coolant are 
contained in 380 to 480 pressure tubes housed in a 
calandria containing the heavy water moderator 
under low pressure. Heat generated by the fuel is 
removed by heavy water coolant that flows 
through the pressure tubes and is then circulated 
to the boilers to produce steam from demineralized
water.

While the use of natural uranium fuel offers
important benefits from the perspectives of
safeguards and operating economics, one drawback is
that it restricts the ability of a CANDU reactor to
recover from a large power reduction. In
particular, the lower reactivity of natural uranium fuel
means that CANDU reactors are designed with a
small number of control rods (called “adjuster
rods”) that are only capable of accommodating
power reductions to 60%. The consequence of a
larger power reduction is that the reactor will
“poison out” and cannot be made critical for up to 2
days following a power reduction. By comparison,
the use of enriched fuel enables a typical
pressurized water reactor to operate with a large number
of control rods that can be withdrawn to
accommodate power reductions to zero power.

A unique feature of some CANDU plants
(namely, Bruce B and Darlington) is a capability to
maintain the reactor at 60% full power if the
generator becomes disconnected from the grid and to
maintain this “readiness” condition if necessary
for days. Once reconnected to the grid, the unit
can be loaded to 60% full power within several
minutes and can achieve full power within 24
hours.

As with other nuclear reactors, CANDU reactors 
normally operate continuously at full power 
except when shut down for maintenance and 
inspections. As such, while they provide a stable 
source of baseload power generation, they cannot 
provide significant additional power in response 
to sudden increases in demand. CANDU power 
plants are not designed for black-start operation; 
that is, they are not designed to start up in the 
absence of power from the grid. 

Electrical Distribution Systems 

The electrical distribution systems at nuclear
power plants are designed to satisfy the high
safety and reliability requirements for nuclear
systems. This is achieved through flexible bus
arrangements, high capacity standby power
generation, and ample redundancy in equipment.

Where continuous power is required, power is
supplied either from batteries (for continuous DC
power, Class I) or via inverters (for continuous AC
power, Class II). AC supply for safety-related
equipment, which can withstand short
interruption (on the order of 5 minutes), is provided by
Class III power. Class III power is nominally
supplied through Class IV; when Class IV becomes
unavailable, standby generators are started
automatically, and the safety-related loads are picked
up within 5 minutes of the loss of Class IV power.

The Class IV power is an AC supply to reactor
equipment and systems that can withstand longer
interruptions in power. Class IV power can be
supplied either from the generator through a
transformer or from the grid by another transformer.
Class IV power is not required for reactors to shut
down safely.

In addition to the four classes of power described
above, there is an additional source of power
known as the Emergency Power System (EPS).
EPS is a separate power system consisting of its
own on-site power generation and AC and DC
distribution systems whose normal supply is from
the Class III power system. The purpose of the EPS
is to provide power to selected safety-related
loads following common mode incidents,
such as seismic events.
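
The power classes described above can be summarized as a small lookup table. The following is only an illustrative sketch: the `PowerClass` type, its field names, and the helper function are hypothetical names introduced here, and the entries simply restate the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerClass:
    # Field names are illustrative; the values paraphrase the report text.
    name: str
    supply: str
    tolerable_interruption: str


POWER_CLASSES = [
    PowerClass("Class I", "batteries (continuous DC)", "none"),
    PowerClass("Class II", "inverters (continuous AC)", "none"),
    PowerClass("Class III", "Class IV normally; standby generators on loss of Class IV",
               "on the order of 5 minutes"),
    PowerClass("Class IV", "unit generator or grid, each through a transformer",
               "longer interruptions"),
]


def classes_available_without_class_iv():
    """Names of the classes that can still be supplied once Class IV is lost."""
    return [pc.name for pc in POWER_CLASSES if pc.name != "Class IV"]


print(classes_available_without_class_iv())
```

The EPS is left out of the table because the text describes it as a separate system rather than a fifth class.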

Protective Features of CANDU Nuclear Power
Plants 

CANDU reactors typically have two separate,
independent and diverse systems to shut down
the reactor in the event of an accident or transients
in the grid. Shutdown System 1 (SDS1) consists of
a large number of cadmium rods that drop into the
core to decrease the power level by absorbing
neutrons. Shutdown System 2 (SDS2) consists of
high-pressure injection of gadolinium nitrate into
the low-pressure moderator to decrease the power
level by absorbing neutrons. Although Pickering A
does not have a fully independent SDS2, it does
have a second shutdown mechanism, namely, the
fast drain of the moderator out of the calandria;
removal of the moderator significantly reduces the
rate of nuclear fission, which reduces reactor
power. Also, additional trip circuits and shutoff
rods have recently been added to Pickering A Unit
4 (Shutdown System Enhancement, or SDS-E).
Both SDS1 and SDS2 are capable of reducing
reactor power from 100% to about 2% within a few
seconds of trip initiation.

Fuel Heat Removal Features of CANDU 
Nuclear Power Plants 

Following the loss of Class IV power and
shutdown of the reactor through action of SDS1 and/or
SDS2, significant heat will continue to be
generated in the reactor fuel from the decay of fission
products. The CANDU design philosophy is to
provide defense in depth in the heat removal
systems.

Immediately following the trip and prior to
restoration of Class III power, heat will be removed
from the reactor core by natural circulation of
coolant through the Heat Transport System main
circuit following rundown of the main Heat
Transport pumps (first by thermosyphoning and later by
intermittent buoyancy induced flow). Heat will be
rejected from the secondary side of the steam
generators through the atmospheric steam discharge
valves. This mode of operation can be sustained
for many days with additional feedwater supplied
to the steam generators via the Class III powered
auxiliary steam generator feed pump(s).

In the event that the auxiliary feedwater system
becomes unavailable, there are two alternate EPS
powered water supplies to steam generators,
namely, the Steam Generator Emergency Coolant
System and the Emergency Service Water System.
Finally, a separate and independent means of
cooling the fuel is by forced circulation by means
of the Class III powered shutdown cooling system;
heat removal to the shutdown cooling heat
exchangers is by means of the Class III powered
components of the Service Water System.

CANDU Reactor Response to 
Loss-of-Grid Event 

Response to Loss of Grid 

In the event of disconnection from the grid, power 
to shut down the reactor safely and maintain 
essential systems will be supplied from batteries 
and standby generators. The specific response of a 
reactor to disconnection from the grid will depend 
on the reactor design and the condition of the unit 
at the time of the event. 

60% Reactor Power: All CANDU reactors are 
designed to operate at 60% of full power following 
the loss of off-site power. They can operate at this 
level as long as demineralized water is available 
for the boilers. At Darlington and Bruce B, steam 
can be diverted to the condensers and recirculated 
to the boilers. At Pickering A and Pickering B, 
excess steam is vented to the atmosphere, thereby 
limiting the operating time to the available
inventory of demineralized water.

0% Reactor Power, Hot: The successful transition
from 100% to 60% power depends on several
systems responding properly, and continued
operation is not guaranteed. The reactor may shut down
automatically through the operation of the process
control systems or through the action of either of
the shutdown systems.

Should a reactor shutdown occur following a load
rejection, both Class IV power supplies (from the
generator and the grid) to that unit will become
unavailable. The main Heat Transport pumps
will trip, leading to a loss of forced circulation of
coolant through the core. Decay heat will be
continuously removed through natural circulation
(thermosyphoning) to the boilers, and steam
produced in the boilers will be exhausted to the
atmosphere via atmospheric steam discharge
valves. The Heat Transport System will be
maintained at around 250 to 265 degrees Celsius during
thermosyphoning. Standby generators will start
automatically and restore Class III power to key
safety-related systems. Forced circulation in the
Heat Transport System will be restored once
either Class III or Class IV power is available.

When shut down, the natural decay of fission
products will lead to the temporary buildup of
neutron absorbing elements in the fuel. If the
reactor is not quickly restarted to reverse this natural
process, it will “poison-out.” Once poisoned-out,
the reactor cannot return to operation until the
fission products have further decayed, a process
which typically takes up to 2 days.

Overpoisoned Guaranteed Shutdown State: In
the event that certain problems are identified
when reviewing the state of the reactor after a
significant transient, the operating staff will cool
down and depressurize the reactor, then place it in
an overpoisoned guaranteed shutdown state (GSS)
through the dissolution of gadolinium nitrate into
the moderator. Maintenance will then be initiated
to correct the problem.

Return to Service Following Loss of Grid 

The return to service of a unit following any one of 
the above responses to a loss-of-grid event is
discussed below. It is important to note that the
descriptions provided relate to operations on a 
single unit. At multi-unit stations, the return to 
service of several units cannot always proceed in 
parallel, due to constraints on labor availability 
and the need to focus on critical evolutions, such 
as taking the reactor from a subcritical to a critical 
state. 

60% Reactor Power: In this state, the unit can be 
resynchronized consistent with system demand, 
and power can be increased gradually to full 
power over approximately 24 hours. 

0% Reactor Power, Hot: In this state, after
approximately 2 days for the poison-out, the turbine can
be run up and the unit synchronized. Thereafter, 
power can be increased to high power over the 
next day. This restart timeline does not include 
the time required for any repairs or maintenance 
that might have been necessary during the outage. 

Overpoisoned Guaranteed Shutdown State:
Placing the reactor in a GSS after it has been shut down
requires approximately 2 days. Once the
condition that required entry to the GSS is rectified, the
restart requires removal of the guarantee, removal
of the gadolinium nitrate through ion exchange
process, heatup of the Heat Transport System, and
finally synchronization to the grid. Approximately
4 days are required to complete these restart
activities. In total, 6 days from shutdown are required
to return a unit to service from the GSS, and this
excludes any repairs that might have been
required while in the GSS.
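
The return-to-service durations quoted above can be added up per end state. The sketch below is illustrative only: the dictionary keys, step labels, and function name are hypothetical, the durations are the approximate figures from the text, and repair time is excluded, as it is in the text.

```python
# Approximate step durations in days, as quoted in the text above.
# Keys and step labels are illustrative names, not terms from any procedure.
RETURN_TO_SERVICE_DAYS = {
    "60% reactor power": [("raise power to full power", 1.0)],
    "0% reactor power, hot": [
        ("wait out the poison-out", 2.0),
        ("raise power to high power", 1.0),
    ],
    "overpoisoned GSS": [
        ("place the reactor in the GSS", 2.0),
        ("restart activities through synchronization", 4.0),
    ],
}


def total_days(state: str) -> float:
    """Sum the quoted step durations for one end state (repairs excluded)."""
    return sum(days for _, days in RETURN_TO_SERVICE_DAYS[state])


for state in RETURN_TO_SERVICE_DAYS:
    print(f"{state}: about {total_days(state):g} days")
```

Note that the end points differ: the GSS total runs to grid synchronization, while the other two totals run to full or high power.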


Summary of Canadian Nuclear Power 
Plant Response to and Safety During the 
August 14 Outage 

On the afternoon of August 14, 2003, 15 Canadian 
nuclear units were operating: 13 in Ontario, 1 in 
Quebec, and 1 in New Brunswick. Of the 13 
Ontario reactors that were critical at the time of 
the event, 11 were operating at or near full power 
and 2 at low power (Pickering B Unit 7 and 
Pickering A Unit 4). All 13 of the Ontario reactors 
disconnected from the grid as a result of the grid 
disturbance. Seven of the 11 reactors operating at 
high power shut down, while the remaining 4 
operated in a planned manner that enabled them 
to remain available to reconnect to the grid at the 
request of Ontario’s IMO. Of the 2 Ontario reactors 
operating at low power, Pickering A Unit 4 tripped 
automatically, and Pickering B Unit 7 was tripped 
manually and shut down. In addition, a transient 
was experienced at New Brunswick Power’s Point 
Lepreau Nuclear Generating Station, resulting in a 
reduction in power. Hydro Quebec’s Gentilly-2 
nuclear station continued to operate normally as 
the Hydro Quebec grid was not affected by the grid 
disturbance. 
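
As a consistency check on the counts in the paragraph above, the unit tally can be reproduced in a few lines. This is a minimal sketch; the dictionary keys are labels chosen here for illustration.

```python
# Ontario units that were critical on August 14, grouped by outcome (from the text).
ontario = {
    "high power, shut down": 7,
    "high power, remained available": 4,
    "low power, tripped": 2,
}

# Units operating by province.
operating = {"Ontario": sum(ontario.values()), "Quebec": 1, "New Brunswick": 1}

high_power = ontario["high power, shut down"] + ontario["high power, remained available"]
total_operating = sum(operating.values())

print(f"Ontario units critical: {operating['Ontario']}")
print(f"Ontario units at or near full power: {high_power}")
print(f"Canadian units operating: {total_operating}")
```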

Nuclear Power Plants With Significant 
Transients 

Pickering Nuclear Generating Station. The
Pickering Nuclear Generating Station (PNGS) is
located in Pickering, Ontario, on the shores of
Lake Ontario, 19 miles (30 km) east of Toronto. It
houses 8 nuclear reactors, each capable of
delivering 515 MW to the grid. Three of the 4 units at
Pickering A (Units 1 through 3) have been shut 
down since late 1997. Unit 4 was restarted earlier 
this year following a major refurbishment and was 
in the process of being commissioned at the time 
of the event. At Pickering B, 3 units were operating 
at or near 100% prior to the event, and Unit 7 was 
being started up following a planned maintenance 
outage. 

Pickering A. As part of the commissioning process,
Unit 4 at Pickering A was operating at 12% power
in preparation for synchronization to the grid. The
reactor automatically tripped on SDS1 due to Heat
Transport Low Coolant Flow, when the Heat
Transport main circulating pumps ran down
following the Class IV power loss. The decision was
then made to return Unit 4 to the guaranteed
shutdown state. Unit 4 was synchronized to the grid on
August 20, 2003. Units 1, 2 and 3 were in lay-up
mode.

Pickering B. The Unit 5 Generator Excitation
System transferred to manual control due to large
voltage oscillations on the grid at 16:10 EDT and
then tripped on Loss of Excitation about 1 second
later (prior to grid frequency collapse). In response
to the generator trip, Class IV buses transferred to
the system transformer and the reactor set back.
The grid frequency collapse caused the System
Service Transformer to disconnect from the grid,
resulting in a total loss of Class IV power. The
reactor consequently tripped on the SDS1 Low
Gross Flow parameter followed by an SDS2 trip
due to Low Core Differential Pressure.

The Unit 6 Generator Excitation System also
transferred to manual control at 16:10 EDT due to
large voltage oscillations on the grid and the
generator remained connected to the grid in manual
voltage control. Approximately 65 seconds into
the event, the grid under-frequency caused all the
Class IV buses to transfer to the Generator Service
Transformer. Ten seconds later, the generator
separated from the grid. Five seconds later, the
generator tripped on Loss of Excitation, which caused a
total loss of Class IV power. The reactor
consequently tripped on the SDS1 Low Gross Flow
parameter, followed by an SDS2 trip due to Low
Core Differential Pressure.

Unit 7 was coming back from a planned
maintenance outage and was at 0.9% power at the time of
the event. The unit was manually tripped after
loss of Class IV power, in accordance with
procedures, and returned to guaranteed shutdown state.

Unit 8 reactor automatically set back on load
rejection. The setback would normally have been
terminated at 20% power but continued to 2% power
because of the low boiler levels. The unit
subsequently tripped on the SDS1 Low Boiler Feedline
Pressure parameter due to a power mismatch
between the reactor and the turbine.

The following equipment problems were noted. At
Pickering, the High Pressure Emergency Coolant
Injection System (HPECIS) pumps are designed to
operate from a Class IV power supply. As a result
of the shutdown of all the operating units, the
HPECIS at both Pickering A and Pickering B
became unavailable for 5.5 hours. (The design of
Pickering A and Pickering B HPECIS must be such
that the fraction of time for which it is not
available can be demonstrated to be less than 10^-3
years per year, or about 8 hours per year. This was
the first unavailability of the HPECIS for 2003.) In
addition, Emergency High Pressure Service Water
System restoration for all Pickering B units was
delayed because of low suction pressure
supplying the Emergency High Pressure Service Water
pumps. Manual operator intervention was
required to restore some pumps back to service.
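
The unavailability target quoted above converts to hours per year with simple arithmetic. The following is a minimal sketch; the function name is illustrative, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def allowed_unavailability_hours(fraction: float) -> float:
    """Convert an unavailability fraction into hours per year."""
    return fraction * HOURS_PER_YEAR


target_hours = allowed_unavailability_hours(1e-3)  # about 8.76 hours per year
observed_hours = 5.5                               # HPECIS outage on August 14, 2003
within_target = observed_hours < target_hours

print(f"target: {target_hours:.2f} h/year, observed: {observed_hours} h, "
      f"within target: {within_target}")
```

On these figures the 5.5-hour outage, being the first of the year, stays below the roughly 8-hour annual allowance.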

Units were synchronized to the grid as follows: 
Unit 8 on August 22, Unit 5 on August 23, Unit 6 
on August 25, and Unit 7 on August 29. 

Darlington Nuclear Generating Station. Four
reactors are located at the Darlington Nuclear
Generating Station, which is on the shores of Lake
Ontario in the Municipality of Clarington, 43
miles (70 km) east of Toronto. All four of the
reactors are licensed to operate at 100% of full power,
and each is capable of delivering approximately
880 MW to the grid.

Unit 1 automatically stepped back to the 60%
reactor power state upon load rejection at 16:12
EDT. The shift supervisor could not approve
automatic withdrawal of the adjuster rods because
too little time was available to complete the
verification of systems required by procedure.
The decreasing steam pressure
and turbine frequency then required the reactor to
be manually tripped on SDS1, as per procedure for
loss of Class IV power. The trip occurred at 16:24
EDT, followed by a manual turbine trip due to
under-frequency concerns.

Like Unit 1, Unit 2 automatically stepped back
upon load rejection at 16:12 EDT. As with Unit 1,
there was insufficient time for the shift supervisor
to complete the verification of systems, and faced
with decreasing steam pressure and turbine
frequency, the decision was made to shut down Unit
2. Due to under-frequency on the main Primary
Heat Transport pumps, the turbine was tripped
manually, which resulted in an SDS1 trip at 16:28
EDT.

Unit 3 experienced a load rejection at 16:12 EDT, 
and during the stepback Unit 3 was able to sustain 
operation with steam directed to the condensers. 
After system verifications were complete,
approval to place the adjuster rods on automatic was
obtained in time to recover, at 59% reactor power. 
The unit was available to resynchronize to the 
grid. 

Unit 4 experienced a load rejection at 16:12 EDT,
and required a manual SDS1 trip due to the loss of
Class II bus. This was followed by a manual
turbine trip.

The following equipment problems were noted:
Unit 4 Class II inverter trip on BUS A3 and
subsequent loss of critical loads prevented unit
recovery. The Unit 0 Emergency Power System
BUS B135 power was lost until the Class III power
was restored. (A planned battery bank B135
change out was in progress at the time of the
blackout.)

Units were synchronized to the grid as follows: 
Unit 3 at 22:00 EDT on August 14; Unit 2 on 
August 17, 2003; Unit 1 on August 18, 2003; and 
Unit 4 on August 18, 2003. 

Bruce Power. Eight reactors are located at Bruce 
Power on the eastern shore of Lake Huron between 
Kincardine and Port Elgin, Ontario. Units 5 
through 8 are capable of generating 840 MW each. 
Presently these reactors are operating at 90% of 
full power due to license conditions imposed by 
the CNSC. Units 1 through 4 have been shut down 
since December 31, 1997. At the time of the event, 
work was being performed to return Units 3 and 4 
to service. 

Bruce A. Although these reactors were in
guaranteed shutdown state, they were manually tripped,
in accordance with operating procedures. SDS1
was manually tripped on Units 3 and 4, as per
procedures for a loss of Class IV power event. SDS1
was re-poised on both units when the station
power supplies were stabilized. The emergency
transfer system functioned as per design, with the
Class III standby generators picking up station
electrical loads. The recently installed Qualified
Diesel Generators received a start signal and were
available to pick up emergency loads if necessary.

Bruce B. Units 5, 6, 7, and 8 experienced initial 
generation rejection and accompanying stepback 
on all four reactor units. All generators separated 
from the grid on under-frequency at 16:12 EDT. 
Units 5, 7, and 8 maintained reactor power at 60% 
of full power and were immediately available for 
reconnection to the grid. 

Although initially surviving the loss of grid event, 
Unit 6 experienced an SDS1 trip on insufficient
Neutron Over Power (NOP) margin. This occurred 
while withdrawing Bank 3 of the adjusters in an 
attempt to offset the xenon transient, resulting in a 
loss of Class IV power. 

The following equipment problems were noted: 
An adjuster rod on Unit 6 had been identified on 
August 13, 2003, as not working correctly. Unit 6 
experienced a High Pressure Recirculation Water 
line leak, and the Closed Loop Demineralized 
Water loop lost inventory to the Emergency Water 
Supply System. 


Units were synchronized to the grid as follows: 
Unit 8 at 19:14 EDT on August 14, 2003; Unit 5 at 
21:04 EDT on August 14; and Unit 7 at 21:14 EDT 
on August 14, 2003. Unit 6 was resynchronized at 
02:03 EDT on August 23, 2003, after maintenance 
was conducted. 
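
The Bruce B resynchronization times above can be turned into elapsed time off the grid. The sketch below assumes the clock starts at the 16:12 EDT separation quoted earlier for all four generators; the names `SEPARATION`, `RESYNC`, and `hours_off_grid` are illustrative.

```python
from datetime import datetime

# All times are EDT as quoted in the text; naive datetimes are used because
# every timestamp is in the same zone.
SEPARATION = datetime(2003, 8, 14, 16, 12)

RESYNC = {
    "Unit 8": datetime(2003, 8, 14, 19, 14),
    "Unit 5": datetime(2003, 8, 14, 21, 4),
    "Unit 7": datetime(2003, 8, 14, 21, 14),
    "Unit 6": datetime(2003, 8, 23, 2, 3),
}


def hours_off_grid(unit: str) -> float:
    """Hours between grid separation and resynchronization for one unit."""
    return (RESYNC[unit] - SEPARATION).total_seconds() / 3600


for unit in RESYNC:
    print(f"{unit}: {hours_off_grid(unit):.2f} hours")
```

The three units that held 60% power were back within about 3 to 5 hours, while Unit 6, which tripped, was off the grid for more than 8 days.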

Point Lepreau Nuclear Generating Station. The
Point Lepreau nuclear station overlooks the Bay of
Fundy on the Lepreau Peninsula, 25 miles (40 km) 
southwest of Saint John, New Brunswick. Point 
Lepreau is a single-unit CANDU 6, designed for a 
gross output of 680 MW. It is owned and operated 
by New Brunswick Power. 

Point Lepreau was operating at 91.5% of full
power (610 MWe) at the time of the event. When
the event occurred, the unit responded to changes
in grid frequency as per design. The net impact
was a short-term drop in output by 140 MW, with
reactor power remaining constant and excess
thermal energy being discharged via the unit steam
discharge valves. During the 25 seconds of the
event, the unit stabilizer operated numerous times
to help dampen the turbine generator speed
oscillations that were being introduced by the grid
frequency changes. Within 25 minutes of the event
initiation, the turbine generator was reloaded to
610 MW. Given the nature of the event that
occurred, there were no unexpected observations
on the New Brunswick Power grid or at Point
Lepreau Generating Station throughout the
ensuing transient.

Nuclear Power Plants With No Transient 

Gentilly-2 Nuclear Station. Hydro Quebec owns 
and operates Gentilly-2 nuclear station, located on 
the south shore of the St. Lawrence River opposite 
the city of Trois-Rivieres, Quebec. Gentilly-2 is 
capable of delivering approximately 675 MW to 
Hydro Quebec’s grid. The Hydro Quebec grid was 
not affected by the power system outage and 
Gentilly-2 continued to operate normally. 

General Observations Based on the Facts 
Found During Phase I 

Following the review of the data provided by the
Canadian nuclear power plants, the CNWG
concludes the following:

♦ None of the reactor operators had any advance
warning of impending collapse of the grid.

♦ Canadian nuclear power plants did not trigger 
the power system outage or contribute to its 
spread. 

♦ There were no risks to the health and safety of
workers or the public as a result of the concurrent
shutdown of several reactors. Automatic
safety systems for the turbine generators and
reactors worked as designed. (See Table 8.3 for
a summary of shutdown events for Canadian
nuclear power plants.)

The CNWG also identified the following
secondary issues:

♦ Equipment problems and design limitations at 
Pickering B resulted in a temporary reduction in 
the effectiveness of some of the multiple safety 
barriers, although the equipment failure was 
within the unavailability targets found in the 
OP&Ps approved by the CNSC as part of Ontario 
Power Generation’s license. 

♦ Existing OP&Ps place constraints on the use of
adjuster rods to respond to events involving
rapid reductions in reactor power. While
greater flexibility with respect to use of adjuster 
rods would not have prevented the shutdown, 
some units, particularly those at Darlington, 
might have been able to return to service less 
than 1 hour after the initiating event. 

♦ Off-site power was unavailable for varying
periods of time, from approximately 3 hours at
Bruce B to approximately 9 hours at Pickering 
A. Despite the high priority assigned by the IMO 
to restoring power to the nuclear stations, the 
stations had some difficulty obtaining timely 
information about the status of grid recovery 
and the restoration of Class IV power. This 
information is important for Ontario Power 
Generation’s and Bruce Power’s response 
strategy. 

♦ Required regulatory approvals from CNSC staff
were obtained quickly and did not delay the
restart of the units; however, CNSC staff was
unable to immediately activate the CNSC’s
Emergency Operation Centre because of loss of
power to the CNSC’s head office building.
CNSC staff, therefore, established communications
with licensees and the U.S. NRC from
other locations.

Table 8.3. Summary of Shutdown Events for Canadian Nuclear Power Plants

[Table body not recoverable from the source text.]

(a) Pickering A Unit 1 tripped as a result of electrical bus configuration immediately prior to the event which resulted in a temporary
loss of Class II power.
(b) Pickering A Unit 4 also tripped on SDS-E.

Notes: Unit 7 at Pickering B was operating at low power, warming up prior to reconnecting to the grid after a maintenance outage.
Unit 4 at Pickering A was producing at low power, as part of the reactor’s commissioning after extensive refurbishment since being
shut down in 1997.

Regulatory Activities Subsequent to the 
Blackout 

The actuation of emergency shutdown systems at
Bruce, Darlington and Pickering, and the
impairment of the High Pressure Emergency Coolant
Injection System (HPECIS) at Pickering are events
for which licensees need to file reports with the
Canadian Nuclear Safety Commission (CNSC), in
accordance with Regulatory Standard S-99,
“Reporting Requirements for Operating Nuclear
Power Plants.” Reports have been submitted by
Ontario Power Generation (OPG) and Bruce
Power, and are being followed up by staff from the
CNSC as part of the CNSC’s normal regulatory
process. This includes CNSC’s review and
approval, where appropriate, of any actions taken
or proposed to be taken to correct any problems in
design, equipment or operating procedures
identified by OPG and Bruce Power.

As a result of further information about the event
gathered by CNSC staff during follow-up
inspections, the temporary impairment of the HPECIS at
Pickering has been rated by CNSC staff as Level 2
on the International Nuclear Event Scale,
indicating that there was a significant failure in safety
provisions, but with sufficient backup systems, or
“defense-in-depth,” in place to cope with potential
malfunctions. Since August 2003, OPG has
implemented procedural and operational changes to
improve the performance of the safety systems at
Pickering.

Conclusions of the Canadian Nuclear 
Working Group 

As discussed above, Canadian nuclear power 
plants did not trigger the power system outage or 
contribute to its spread. The CNWG therefore 
made no recommendations with respect to the 
design or operation of Canadian nuclear plants to 
improve the reliability of the Ontario electricity 
grid. 

The CNWG made two recommendations, one
concerning backup electrical generation equipment
to the CNSC’s Emergency Operations Centre and
another concerning the use of adjuster rods during
future events involving the loss of off-site power.
These are presented in Chapter 10 along with the
Task Force’s recommendations on other subjects.

Despite some comments to the contrary, the
CNWG’s investigation found that the time to
restart the reactors was reasonable and in line
with design specifications for the reactors.
Therefore, the CNWG made no recommendations for
action on this matter. Comments were also made
regarding the adequacy of generation capacity in
Ontario and the appropriate mix of technologies
for electricity generation. This is a matter beyond
the CNWG’s mandate, and it made no
recommendations on this issue.

Perspective of Nuclear Regulatory Agencies
on Potential Changes to the Grid

The NRC and the CNSC, under their respective
regulatory authorities, are entrusted with
providing reasonable assurance of adequate protection of
public health and safety. As the design and
operation of the electricity grid is taken into account
when evaluating the safety analysis of nuclear
power plants, changes to the electricity grid must
be evaluated for the impact on plant safety. As the
Task Force’s final recommendations result in
actions to effect changes, the NRC and the CNSC
will assist by evaluating potential effects on the
safety of nuclear power plant operation.

The NRC and the CNSC acknowledge that future
improvements in grid reliability will involve
coordination among many groups. The NRC and the
CNSC intend to maintain the good working
relationships that have been developed during the
Task Force investigation to ensure that we
continue to share experience and insights and work
together to maintain an effective and reliable
electric supply system.

Endnotes 

1 Further details are available in the NRC Special Inspection 
Report dated December 22, 2003, ADAMS Accession No. 
ML033570386. 

2 Further details are available in the NRC Special Inspection 
Report dated December 22, 2003, ADAMS Accession No. 
ML033570386. 

3 Further details are available in the NRC Special Inspection
Report dated October 10, 2003, ADAMS Accession No.
ML032880107.

9. Physical and Cyber Security Aspects of the Blackout


Summary and Primary Findings 

After the Task Force Interim Report was issued in
November 2003, the Security Working Group
(SWG) continued in its efforts to investigate
whether a malicious cyber event directly caused
or significantly contributed to the power outage of
August 14, 2003. These efforts included additional
analyses of interviews conducted prior to
the release of the Interim Report and additional
consultations with representatives from the
electric power sector. The information gathered from
these efforts validated the SWG’s Interim Report
preliminary findings and the SWG found no reason
to amend, alter, or negate any of the information
submitted to the Task Force for the Interim
Report.

Specifically, further analysis by the SWG found
no evidence that malicious actors caused or
contributed to the power outage, nor is there evidence
that worms or viruses circulating on the Internet at
the time of the power outage had an effect on
power generation and delivery systems of the
companies directly involved in the power outage.
The SWG acknowledges reports of al-Qaeda
claims of responsibility for the power outage of
August 14, 2003. However, these claims are not
consistent with the SWG’s findings. SWG analysis
also brought to light certain concerns respecting
the possible failure of alarm software; links to
control and data acquisition software; and the lack of
a system or process for some grid operators to
adequately view the status of electric systems outside
of their immediate control.

After the release of the Interim Report in November
2003, the SWG determined that the existing
data, and the findings derived from analysis of
those data, provided sufficient certainty to
exclude the probability that a malicious cyber
event directly caused or significantly contributed
to the power outage events. As such, further data
collection efforts to conduct broader analysis were
deemed unnecessary. While no additional data
were collected, further analysis and interviews
conducted after the release of the Interim Report
allowed the SWG to validate its preliminary
findings and to make recommendations on
those findings:

Recommendation 32, page 163

♦ Interviews and analyses conducted by the SWG
indicate that within some of the companies
interviewed there are potential opportunities
for cyber system compromise of Energy Management
Systems (EMS) and their supporting
information technology (IT) infrastructure.
Indications of procedural and technical IT
management vulnerabilities were observed in some
facilities, such as unnecessary software services
not denied by default, loosely controlled system
access and perimeter control, poor patch and
configuration management, and poor system
security documentation. This situation caused
the SWG to support the promulgation,
implementation, and enforcement of cyber and
physical security standards for the electric power
sector.

♦ A failure in a software program not linked to
malicious activity may have significantly
contributed to the power outage. Since the issuance
of the Interim Report, the SWG consulted with
the software program’s vendor and confirmed
that since the August 14, 2003, power outage,
the vendor provided industry with the necessary
information and mitigation steps to
address this software failure. In Canada, a survey
was posted on the Canadian Electricity
Association (CEA) secure members-only web
site to determine if the software was in use. The
responses indicated that it is not used by Canadian
companies in the industry.

♦ Internal and external links from Supervisory
Control and Data Acquisition (SCADA) networks to
other systems introduced vulnerabilities.


Recommendation 33, page 164

Recommendation 34, page 165

♦ In some cases, Control Area (CA) and Reliability
Coordinator (RC) visibility into the operations of
surrounding areas was lacking.
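
One weakness listed above, unnecessary software services not denied by default, can be illustrated with a minimal Python sketch. The service names and the allowlist are hypothetical and chosen only for illustration; they are not drawn from any configuration the SWG reviewed. The point is simply that, under a deny-by-default rule, anything not explicitly allowed is flagged.

```python
# Hypothetical allowlist: the only services this illustrative host should run.
ALLOWED_SERVICES = {"ems-telemetry", "ssh"}


def is_permitted(service: str) -> bool:
    """Deny by default: a service is permitted only if explicitly allowed."""
    return service in ALLOWED_SERVICES


def audit(running_services):
    """Return the running services that are not on the allowlist, sorted."""
    return sorted(s for s in running_services if not is_permitted(s))


if __name__ == "__main__":
    # Hypothetical inventory of services found running on a host.
    found = ["ssh", "telnet", "ems-telemetry", "ftp"]
    print(audit(found))
```

An audit of this kind only reports exceptions; deciding whether a flagged service is actually unnecessary remains an operational judgment.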



The SWG’s analysis is reflected in a total of 15
recommendations, two of which were combined with
similar concerns by the ESWG (Recommendations
19 and 22); for the remaining 13, see
Recommendations 32-44 (pages 163-169).


Overall, the SWG’s final report was the result of
interviews conducted with representatives of
Cinergy, FirstEnergy, American Electric Power
(AEP), PJM Interconnect, the Midwest Independent
System Operator (MISO), the East Central Area
Reliability Coordinating Agreement (ECAR), and
GE Power Systems Division. These entities were
chosen due to their proximity to the causes of the
power outage based on the analysis of the Electric
System Working Group (ESWG). The findings
contained in this report relate only to those
entities surveyed. The final report also incorporates
information gathered from third party sources
as well as federal security and intelligence
communities.


In summary, SWG analysis provided no evidence
that a malicious cyber attack was a direct or
indirect cause of the August 14, 2003, power outage.
This conclusion is supported by the SWG’s event
timeline, detailed later in this chapter, which
explains in detail the series of non-malicious
human and cyber failures that ultimately resulted
in the power outage. In the course of its analysis,
however, the SWG did identify a number of areas
of concern respecting cyber security aspects of the
electricity sector.


SWG Mandate and Scope 

It is widely recognized that the increased reliance
on IT by critical infrastructure sectors, including
the energy sector, has increased the vulnerability
of these systems to disruption via cyber means.
The ability to exploit these vulnerabilities has
been demonstrated in North America. The SWG
was composed of United States and Canadian
federal, state, provincial and local experts in both
physical and cyber security, and its objective was
to determine the role, if any, that a malicious cyber
event played in causing, or contributing to, the
power outage of August 14, 2003. For the purposes
of its work, the SWG defined a “malicious cyber
event” as the manipulation of data, software or
hardware for the purpose of deliberately disrupting
the systems that control and support the
generation and delivery of electric power.

The SWG worked closely with the United States
and Canadian law enforcement, intelligence and
homeland security communities to examine the
possible role of malicious actors in the power
outage. A primary activity in this endeavor was the
collection and review of available intelligence
related to the power outage of August 14, 2003.
The SWG also collaborated with the energy industry
to examine the cyber systems that control
power generation and delivery operations, the
physical security of cyber assets, cyber policies
and procedures and the functionality of supporting
infrastructures (such as communication systems
and backup power generation, which
facilitate the smooth operation of cyber
assets) to determine if the operation of these
systems was affected by malicious activity. The SWG
coordinated its efforts with those of other Working
Groups and there was a significant interdependence
on each group’s work products and findings.
The SWG’s focus was on the cyber operations of
those companies in the United States involved in
the early stages of the power outage timeline, as
identified by the ESWG.

Outside of the SWG’s scope was the examination
of the non-cyber physical infrastructure aspects of
the power outage of August 14, 2003. The Interim
Report detailed the SWG’s availability to investigate
breaches of physical security unrelated to the
cyber dimensions of the infrastructure on behalf
of the Task Force but no incidents came to the
SWG’s attention during its work. Also outside of
the scope of the SWG’s work was analysis of the
impacts the power outage had on other critical
infrastructure sectors. Both Public Safety and
Emergency Preparedness Canada and the U.S.
Department of Homeland Security (DHS) examined
these issues, but not within the context of the
SWG.


Cyber Security in the 
Electricity Sector 

The generation and delivery of electricity has
been, and continues to be, a target of malicious
groups and individuals intent on disrupting this
system. Even attacks that do not directly target the
electricity sector can have disruptive effects on
electricity system operations. Many malicious
code attacks, by their very nature, are unbiased
and tend to interfere with operations supported by
vulnerable applications. One such incident
occurred in January 2003, when the “Slammer”
Internet worm took down monitoring computers
at FirstEnergy Corporation’s idled Davis-Besse
nuclear plant. A subsequent report by the North
American Electric Reliability Council (NERC)
concluded that although the infection caused no
outages, it blocked commands that operated other
power utilities. 1

This example, among others, highlights the
increased vulnerability to disruption via cyber
means faced by North America’s critical
infrastructure sectors, including the energy sector. Of
specific concern to the United States and Canadian
governments are the SCADA networks,
which contain computers and applications that
perform a wide variety of functions across many
industries. In electric power, SCADA includes
telemetry for status and control, as well as EMS,
protective relaying and automatic generation
control. SCADA systems were developed to maximize
functionality and interoperability, with little
attention given to cyber security. These systems,
many of which were intended to be isolated, now
find themselves, for a variety of business and
operational reasons, either directly or indirectly
connected to the global Internet. For example, in some
instances, there may be a need for employees to
monitor SCADA systems remotely. However,
connecting SCADA systems to a remotely accessible
computer network can present security risks.
These risks include the compromise of sensitive
operating information and the threat of
unauthorized access to SCADA systems’ control
mechanisms.

Security has always been a priority for the
electricity sector in North America; however, it is a
greater priority now than ever before. CAs and RCs
recognize that the threat environment is changing
and that the risks are greater than in the past, and
they have taken steps towards improving their
security postures. NERC’s Critical Infrastructure
Protection Advisory Group has been examining
ways to improve both the physical and cyber
security dimensions of the North American power
grid. This group is composed of Canadian and
U.S. industry experts in the areas of cyber
security, physical security and operational security.
The creation of a national SCADA program is now
also under discussion in the U.S. to improve the
physical and cyber security of these control
systems. The Canadian Electricity Association’s
Critical Infrastructure Working Group is
examining similar measures.

Information Collection 
and Analysis 

After analyzing information already obtained
from stakeholder interviews, telephone transcripts,
law enforcement and intelligence information,
and other ESWG working documents, the
SWG determined that it was not necessary to
analyze other sources of data on the cyber operations
of those companies, such as log data from routers,
intrusion detection systems, firewalls, and EMS;
change management logs; and physical security
materials.

The SWG was divided into six sub-teams to
address the discrete components of this
investigation: Cyber Analysis, Intelligence Analysis,
Physical Analysis, Policies and Procedures, Supporting
Infrastructure, and Root Cause Liaison. The SWG
organized itself in this manner to create a holistic
approach to address each of the main areas of
concern with regard to power grid vulnerabilities.
Rather than analyze each area of concern
separately, the SWG sub-team structure provided a
more comprehensive framework in which to
investigate whether malicious activity was a cause
of the power outage of August 14, 2003. Each
sub-team was staffed with Subject Matter Experts
(SMEs) from government, industry, and academia
to provide the analytical breadth and depth
necessary to complete each sub-team’s objective. A
detailed overview of the sub-team structure and
activities for each sub-team is provided below.

1. Cyber Analysis 

The Cyber Analysis sub-team was led by the
CERT® Coordination Center (CERT/CC) at Carnegie
Mellon University and the Royal Canadian
Mounted Police (RCMP). This team was focused
on analyzing and reviewing electronic media of
computer networks in which online communications
take place. The sub-team examined these
networks to determine if they were maliciously
used to cause, or contribute to, the August 14,
2003, outage. Specifically, the SWG reviewed
materials created on behalf of DHS’s National
Communications System (NCS). These materials
covered the analysis and conclusions of their
Internet Protocol (IP) modeling correlation study
of Blaster (a malicious Internet worm first noticed
on August 11, 2003) and the power outage. This
NCS analysis supports the SWG’s finding that
viruses and worms prevalent across the Internet
at the time of the outage did not have any
significant impact on power generation and delivery
systems. The team also conducted interviews with
vendors to identify known system flaws and
vulnerabilities.

This sub-team took a number of steps, including
reviewing NERC reliability standards to gain a
better understanding of the overall security
posture of the electric power industry. Additionally,
the sub-team participated in meetings in Baltimore
on August 22 and 23, 2003. The meetings
provided an opportunity for the cyber experts and
the power industry experts to understand the
details necessary to conduct an investigation.

Members of the sub-team also participated in the
NERC/Department of Energy (DOE) Fact Finding
meeting held in Newark, New Jersey on September
8, 2003. Each company involved in the outage
provided answers to a set of questions related to
the outage. The meeting helped to provide a better
understanding of what each company experienced
before, during and after the outage. Additionally,
sub-team members participated in
interviews with grid operators from FirstEnergy
on October 8 and 9, 2003, and from Cinergy on
October 10, 2003.

2. Intelligence Analysis 

The Intelligence Analysis sub-team was led by
DHS and the RCMP, which worked closely with
Federal, State and local law enforcement,
intelligence and homeland security organizations to
assess whether the power outage was the result of
a malicious attack.

SWG analysis provided no evidence that malicious
actors, whether individuals or organizations,
were responsible for, or contributed to, the
power outage of August 14, 2003. Additionally,
the sub-team found no indication of deliberate
physical damage to power generating stations and
delivery lines on the day of the outage and there
were no reports indicating the power outage was
caused by a computer network attack.

Both U.S. and Canadian government authorities
provide threat intelligence information to their
respective energy sectors when appropriate. No
intelligence reports prior to, during or after the
power outage indicated any specific terrorist plans
or operations against the energy infrastructure.
There was, however, threat information of a
general nature relating to the sector which was
provided to the North American energy industry
by U.S. and Canadian Government agencies in late
July 2003. This information indicated that
al-Qaeda might attempt to carry out a physical
attack involving explosions at oil production
facilities, power plants or nuclear plants on the east
coast of the U.S. during the summer of 2003. The
type of physical attack described in the
intelligence that prompted this threat warning is not
consistent with the events causing the power
outage as there was no indication of a kinetic event
before, during, or immediately after the power
outage of August 14, 2003.

Despite all of the above indications that no
terrorist activity caused the power outage, al-Qaeda
publicly claimed responsibility for its occurrence:

♦ August 18, 2003: Al-Hayat, an Egyptian media
outlet, published excerpts from a communique
attributed to al-Qaeda. Al-Hayat claimed to have
obtained the communique from the website of
the International Islamic Media Center. The
content of the communique asserts that the
“brigades of Abu Fahes Al Masri had hit two main
power plants supplying the East of the U.S., as
well as major industrial cities in the U.S. and
Canada, ... its ally in the war against Islam
(New York and Toronto) and their neighbors.”
Furthermore, the operation “was carried out on
the orders of Osama bin Laden to hit the pillars
of the U.S. economy,” as “a realization of bin
Laden’s promise to offer the Iraqi people a
present.” The communique does not specify the way
the alleged sabotage was carried out, but does
elaborate on the alleged damage the sabotage
caused to the U.S. economy in the areas of
finance, transportation, energy and
telecommunications.

Additional claims and commentary regarding the 
power outage appeared in various Middle Eastern 
media outlets: 

♦ August 26, 2003: A conservative Iranian daily
newspaper published a commentary regarding
the potential of computer technology as a tool
for terrorists against infrastructures dependent
on computer networks, most notably water,
electric, public transportation, trade
organizations and “supranational” companies in the
United States.

♦ September 4, 2003: An Islamist participant in a
Jihadist chat room forum claimed that sleeper
cells associated with al-Qaeda used the power
outage as a cover to infiltrate the U.S. from
Canada.

However, these claims, as known, are not
consistent with the SWG’s findings. They are also not
consistent with congressional testimony of the
Federal Bureau of Investigation (FBI). Larry A.
Mefford, Executive Assistant Director in charge of
the FBI’s Counterterrorism and Counterintelligence
programs, testified before the U.S. Congress on
September 4, 2003, that:

“To date, we have not discovered any evidence
indicating that the outage was a result of activity
by international or domestic terrorists or other
criminal activity.” 2

Mr. Mefford also testified that: 

“The FBI has received no specific, credible
threats to electronic power grids in the United
States in the recent past and the claim of the Abu
Hafs al-Masri Brigade to have caused the blackout
appears to be no more than wishful thinking.
We have no information confirming the actual
existence of this group.” 3

Current assessments suggest that there are
terrorists and other malicious actors who have the
capability to conduct a malicious cyber attack with
potential to disrupt the energy infrastructure.
Although such an attack cannot be ruled out
entirely, an examination of available information
and intelligence does not support any claims of a
deliberate attack against the energy infrastructure
on, or leading up to, August 14, 2003. The few
instances of physical damage that occurred on
power delivery lines were the result of natural
events and not of sabotage. No intelligence reports
prior to, during or after the power outage indicated
any specific terrorist plans or operations against
the energy infrastructure. No incident reports
detail suspicious activity near the power
generation plants or delivery lines in question.

3. Physical Analysis 

The Physical Analysis sub-team was led by the
United States Secret Service and the RCMP. These
organizations have a particular expertise in
physical security assessments in the energy sector. The
sub-team focused on issues related to how the
cyber-related facilities of the energy sector
companies were secured, including the physical integrity
of data centers and control rooms along with
security procedures and policies used to limit access to
sensitive areas. Focusing on the facilities
identified as having a causal relationship to the outage,
the sub-team sought to determine if the physical
integrity of these cyber facilities was breached,
whether externally or by an insider, prior to or
during the outage, and if so, whether such a
breach caused or contributed to the power outage.

Although the sub-team analyzed information
provided to both the ESWG and Nuclear Working
Groups, the Physical Analysis sub-team also
reviewed information resulting from face-to-face
meetings with energy sector personnel and
site-visits to energy sector facilities to determine
the physical integrity of the cyber infrastructure.

The sub-team compiled a list of questions covering
location, accessibility, cameras, alarms, locks,
fire protection and water systems as they apply to
computer server rooms. Based on discussions of
these questions during its interviews, the
sub-team found no evidence that the physical
integrity of the cyber infrastructure was breached.
Additionally, the sub-team examined access and
control measures used to allow entry into
command and control facilities and the integrity of
remote facilities.

The sub-team also concentrated on mechanisms 
used by the companies to report unusual incidents 
within server rooms, command and control rooms 
and remote facilities. The sub-team also addressed 
the possibility of an insider attack on the cyber 
infrastructure. 

4. Policies and Procedures 

The Policies and Procedures sub-team was led by
DHS and Public Safety and Emergency Preparedness
Canada. Personnel from these organizations
have strong backgrounds in the fields of electric
delivery operations, automated control systems
including SCADA and EMS, and information
security.

This sub-team was focused on examining the
overall policies and procedures that may or may
not have been in place during the events leading
up to and during the power outage of August 14,
2003. Policies that the team examined revolved
centrally around the cyber systems of the
companies identified in the early stages of the power
outage. Of specific interest to the team were policies
and procedures regarding the upgrade and
maintenance (to include system patching) of the
command and control (C2) systems, including SCADA
and EMS. The Policies and Procedures sub-team
was also interested in the procedures for
contingency operations and restoration of systems in the
event of a computer system failure, or a cyber
event such as an active hack or the discovery of
malicious code.

5. Supporting Infrastructure 

The Supporting Infrastructure sub-team was led
by a DHS expert with experience assessing
supporting infrastructure elements such as water
cooling for computer systems, back-up power
systems, heating, ventilation and air conditioning
(HVAC), and supporting telecommunications
networks. Public Safety and Emergency Preparedness
Canada was the Canadian co-lead for this effort.
This team analyzed the integrity of the supporting
infrastructure and its role, if any, in the power
outage on August 14, 2003. It sought to determine
whether the supporting infrastructure was
performing at a satisfactory level leading up to and
during the power outage of August 14, 2003. In
addition, the team verified with vendors if there
were maintenance issues that may have impacted
operations prior to and during the outage.

The sub-team specifically focused on the following
key issues in visits to each of the designated
electrical entities:

1. Carrier/provider/vendor for the supporting 
infrastructure services and/or systems at select 
company facilities; 

2. Loss of service before and/or after the power 
outage; 

3. Conduct of maintenance activities before and/or 
after the power outage; 

4. Conduct of installation activities before and/or 
after the power outage; 

5. Conduct of testing activities before and/or after 
the power outage; 

6. Conduct of exercises before and/or after the 
power outage; and 

7. Existence of a monitoring process (log, checklist
etc.) to document the status of supporting
infrastructure services.

6. Root Cause Analysis 

The SWG Root Cause Liaison Sub-Team (SWG/RC)
followed the work of the ESWG to identify
potential root causes of the power outage. As these
root cause elements were identified, the sub-team
assessed with the ESWG any potential linkages
to physical and/or cyber malfeasance. The final
analysis of the SWG/RC team found no causal link
between the power outage and malicious activity,
whether physical or cyber initiated.

Cyber Timeline 

The following sequence of events was derived 
from discussions with representatives of 
FirstEnergy and the Midwest Independent System 
Operator (MISO). All times are approximate. 

The first significant cyber-related event of August
14, 2003, occurred at 12:40 EDT at the MISO. At
this time, a MISO EMS engineer purposely
disabled the automatic periodic trigger on the State
Estimator (SE) application, an application that
allows MISO to determine the real-time state of
the power system for its region. The disablement
of the automatic periodic trigger, a program
feature that causes the SE to run automatically every
five minutes, is a necessary operating procedure
when resolving a mismatched solution produced
by the SE. The EMS engineer determined that the
mismatch in the SE solution was due to the SE
model depicting Cinergy’s Bloomington-Denois
Creek 230-kV line as being in service, when it had
actually been out of service since 12:12 EDT.

At 13:00 EDT, after making the appropriate 
changes to the SE model and manually triggering 
the SE, the MISO EMS engineer achieved two 
valid solutions. 

At 13:30 EDT, the MISO EMS engineer went to
lunch. However, he forgot to re-engage the
automatic periodic trigger.
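
The events above turn on a periodic trigger that was switched off on purpose and then left off. As a minimal sketch only, and not a description of MISO's software, the following Python models such a trigger and a check that flags it once it has been disabled longer than an allowed limit. The class and function names and the 30-minute limit are hypothetical; only the five-minute interval and the clock times come from the timeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeriodicTrigger:
    """Toy model of an automatic periodic trigger that can be switched off."""
    interval_min: int = 5                  # the SE runs every five minutes
    enabled: bool = True
    disabled_at_min: Optional[int] = None  # minutes since midnight

    def disable(self, now_min: int) -> None:
        self.enabled = False
        self.disabled_at_min = now_min

    def enable(self) -> None:
        self.enabled = True
        self.disabled_at_min = None

    def minutes_disabled(self, now_min: int) -> int:
        """How long the trigger has been off; zero while it is enabled."""
        if self.enabled or self.disabled_at_min is None:
            return 0
        return now_min - self.disabled_at_min


def overdue(trigger: PeriodicTrigger, now_min: int, limit_min: int) -> bool:
    """True when the trigger has been off for longer than the allowed limit."""
    return trigger.minutes_disabled(now_min) > limit_min


def hhmm(text: str) -> int:
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = text.split(":")
    return int(hours) * 60 + int(minutes)


if __name__ == "__main__":
    trigger = PeriodicTrigger()
    trigger.disable(hhmm("12:40"))
    # Hypothetical policy: warn if the trigger stays off more than 30 minutes.
    for clock in ("13:00", "13:30", "14:40"):
        print(clock, overdue(trigger, hhmm(clock), limit_min=30))
```

A reminder of this kind does not replace procedure; it only makes a forgotten manual step visible to someone other than the person who took it.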

At 14:14 EDT, FirstEnergy’s “Alarm and Event
Processing Routine” (AEPR), a key software program
that gives grid operators visual and audible
indications of events occurring on their portion of the
grid, began to malfunction. FirstEnergy grid
operators were unaware that the software was not
functioning properly. This software did not become
functional again until much later that evening.
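
The report does not describe how, or whether, the AEPR was itself monitored. As a generic illustration of one common technique, a heartbeat check, the sketch below flags a monitored process as stalled when it stops reporting. The class name and the 60-second timeout are hypothetical, and this is not the vendor's or FirstEnergy's design.

```python
class HeartbeatMonitor:
    """Flags a monitored process as stalled when its heartbeats stop."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_beat_s = None  # time of the most recent heartbeat, seconds

    def beat(self, now_s: float) -> None:
        """Record that the monitored process reported in at time now_s."""
        self.last_beat_s = now_s

    def is_stalled(self, now_s: float) -> bool:
        """True if no heartbeat has ever arrived or the last one is too old."""
        if self.last_beat_s is None:
            return True
        return (now_s - self.last_beat_s) > self.timeout_s


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=60)  # hypothetical timeout
    monitor.beat(now_s=0)
    print(monitor.is_stalled(now_s=30))   # recent heartbeat
    print(monitor.is_stalled(now_s=120))  # heartbeat is two minutes old
```

Times are passed in explicitly so the check can be exercised without waiting on a real clock.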

At 14:40 EDT, an Ops Engineer discovered that the SE
was not solving and went to notify an EMS
engineer.

At 14:41 EDT, FirstEnergy’s server running the
AEPR software failed to the backup server. Control
room staff remained unaware that the AEPR
software was not functioning properly.

At 14:44 EDT, a MISO EMS engineer, after being
alerted by the Ops Engineer, re-activated the
automatic periodic trigger and, for speed, manually
triggered the program. However, the SE program
again showed a mismatch.

At 14:54 EDT, FirstEnergy’s backup server failed.
AEPR continued to malfunction. The Area Control
Error (ACE) calculation and strip-charting routines
malfunctioned and the dispatcher user interface
slowed significantly.

At 15:00 EDT, FirstEnergy used its emergency 
backup system to control the system and make 
ACE calculations. ACE calculations and control 
systems continued to run on the emergency 
backup system until roughly 15:08 EDT, when the 
primary server was restored. 

At 15:05 EDT, FirstEnergy’s Harding-Chamberlin
345-kV line tripped and locked out. FirstEnergy
grid operators did not receive notification from the
AEPR software, which continued to malfunction,
unbeknownst to the FirstEnergy grid operators.

At 15:08 EDT, using data obtained at roughly
15:04 EDT (it takes roughly five minutes for the SE
to provide a result), the MISO EMS engineer
concluded that the SE mismatched due to a line
outage. His experience allowed him to isolate the
outage to the Stuart-Atlanta 345-kV line (which
tripped about an hour earlier at 14:02 EDT). He
took the Stuart-Atlanta line out of service in the SE
model and got a valid solution.

Also at 15:08 EDT, the FirstEnergy primary server
was restored. ACE calculations and control
systems were now running on the primary server.
AEPR continued to malfunction, unbeknownst to
the FirstEnergy grid operators.

At 15:09 EDT, the MISO EMS engineer went to
the control room to tell the grid operators that he
thought the Stuart-Atlanta line was out of service.
Grid operators referred to their “Outage Scheduler”
and informed the EMS Engineer that their
data showed the Stuart-Atlanta line was “up” and
that the EMS engineer should depict the line as in
service in the SE model. At 15:17 EDT, the EMS
engineer ran the SE with the Stuart-Atlanta line
“live,” but the model again mismatched.

At 15:29 EDT, the MISO EMS Engineer asked 
MISO grid operators to call PJM Interconnect, LLC 
to determine the status of the Stuart-Atlanta line. 
MISO was informed that the Stuart-Atlanta line 
tripped at 14:02 EDT. The EMS Engineer adjusted 
the model, which by this time had been updated 
with the 15:05 EDT Harding-Chamberlin 345-kV 
line trip, and came up with a valid solution. 

At 15:32 EDT, FirstEnergy’s Hanna-Juniper 
345-kV line tripped and locked out. The AEPR 
continued to malfunction. 

At 15:41 EDT, the lights flickered at
FirstEnergy’s control facility. This occurred
because the facility had lost grid power and switched
over to its emergency power supply.

At 15:42 EDT, a FirstEnergy dispatcher realized
that the AEPR was not working and made technical
support staff aware of the problem.
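
The length of each gap in the timeline above can be worked out directly from the stated clock times. The short Python sketch below does that arithmetic; the dictionary labels are illustrative wording, and because the report describes all times as approximate, so are the results.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Whole minutes from start to end, both 'HH:MM' on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Start and end times (EDT) taken from the timeline above.
INTERVALS = {
    "SE automatic trigger disabled until re-activated": ("12:40", "14:44"),
    "AEPR malfunction until a dispatcher noticed": ("14:14", "15:42"),
    "Stuart-Atlanta trip until MISO asked PJM about it": ("14:02", "15:29"),
}

if __name__ == "__main__":
    for label, (start, end) in INTERVALS.items():
        print(f"{label}: {minutes_between(start, end)} min")
```

With these times, the three intervals come to 124, 88, and 87 minutes respectively.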

Endnotes 

1 http://www.nrc.gov/reading-rm/doc-collections/news/2003/03-108.html.

2 http://www.fbi.gov/congress/congress03/mefford090403.htm.

3 http://www.fbi.gov/congress/congress03/mefford090403.htm.

10. Recommendations to Prevent or Minimize
the Scope of Future Blackouts


Introduction 

As reported in previous chapters, the blackout on 
August 14, 2003, was preventable. It had several 
direct causes and contributing factors, including: 

♦ Failure to maintain adequate reactive power 
support 

♦ Failure to ensure operation within secure limits 

♦ Inadequate vegetation management 

♦ Inadequate operator training 

♦ Failure to identify emergency conditions and 
communicate that status to neighboring 
systems 

♦ Inadequate regional-scale visibility over the 
bulk power system. 

Further, as discussed in Chapter 7, after each 
major blackout in North America since 1965, an 
expert team of investigators has probed the causes 
of the blackout, written detailed technical reports, 
and issued lists of recommendations to prevent or 
minimize the scope of future blackouts. Yet several
of the causes of the August 14 blackout are
strikingly similar to those of the earlier blackouts.
Clearly, efforts to implement earlier recommendations
have not been adequate. 1 Accordingly, the
recommendations presented below emphasize
comprehensiveness, monitoring, training, and
enforcement of reliability standards when necessary
to ensure compliance.

It is useful to think of the recommendations presented
below in terms of four broad themes:

1. Government bodies in the U.S. and Canada, regulators,
the North American electricity industry,
and related organizations should commit
themselves to making adherence to high reliability
standards paramount in the planning,
design, and operation of North America’s vast
bulk power systems. Market mechanisms
should be used where possible, but in circumstances
where conflicts between reliability and
commercial objectives cannot be reconciled,
they must be resolved in favor of high reliability. 2

2. Regulators and consumers should recognize
that reliability is not free, and that maintaining
it requires ongoing investments and operational
expenditures by many parties. Regulated companies
will not make such outlays without
assurances from regulators that the costs will be
recoverable through approved electric rates,
and unregulated companies will not make such
outlays unless they believe their actions will be
profitable. 3

3. Recommendations have no value unless they
are implemented. Accordingly, the Task Force
emphasizes strongly that North American governments
and industry should commit themselves
to working together to put into effect the
suite of improvements mapped out below. Success
in this area will require particular attention
to the mechanisms proposed for performance
monitoring, accountability of senior management,
and enforcement of compliance with
standards.

4. The bulk power systems are among the most
critical elements of our economic and social
infrastructure. Although the August 14 blackout
was not caused by malicious acts, a number
of security-related actions are needed to
enhance reliability.

Over the past decade or more, electricity demand
has increased and the North American interconnections
have become more densely woven and
heavily loaded, over more hours of the day and
year. In many geographic areas, the number of single
or multiple contingencies that could create
serious problems has increased. Operating the
grids at higher loadings means greater stress on
equipment and a smaller range of options and a
shorter period of time for dealing with unexpected
problems. The system operator’s job has become
more challenging, leading to the need for more
sophisticated grid management tools and more
demanding operator training programs and certification
requirements.

The recommendations below focus on changes of
many kinds that are needed to ensure reliability,
both for the summer of 2004 and for the years to
follow. Making these changes will require higher
and broader awareness of the importance of reliability,
and some of them may require substantial
new investments. However, the cost of not making
these changes, i.e., the cost of chronic large-scale
blackouts, would be far higher than the cost of
addressing the problem. Estimates of the cost of
the August 14 blackout range between $4 billion
and $10 billion (U.S.). 4

The need for additional attention to reliability is
not necessarily at odds with increasing competition
and the improved economic efficiency it
brings to bulk power markets. Reliability and economic
efficiency can be compatible, but this outcome
requires more than reliance on the laws of
physics and the principles of economics. It
requires sustained, focused efforts by regulators,
policy makers, and industry leaders to strengthen
and maintain the institutions and rules needed to
protect both of these important goals. Regulators
must ensure that competition does not erode
incentives to comply with reliability requirements,
and that reliability requirements do not
serve as a smokescreen for noncompetitive
practices.

The metric for gauging achievement of this goal— 
making the changes needed to maintain a high 
level of reliability for the next decade or longer— 
will be the degree of compliance obtained with the 
recommendations presented below. The single 
most important step in the United States is for the 
U.S. Congress to enact the reliability provisions in 
pending energy bills (H.R. 6 and S. 2095). If that 
can be done, many of the actions recommended 
below could be accomplished readily in the 
course of implementing the legislation. 

Some commenters asserted that the Interim
Report did not analyze all factors they believe may
have contributed to the August 14 blackout.
Implementation of the recommendations presented
below will address all remaining issues,
through the ongoing work of government bodies
and agencies in the U.S. and Canada, the electricity
industry, and the non-governmental institutions
responsible for the maintenance of electric
reliability in North America.


Recommendations 

Forty-six numbered recommendations are presented
below, grouped into four substantive areas.
Some recommendations concern subjects that
were addressed in some detail by commenters on
the Interim Report or participants in the Task
Force’s two technical conferences. In such cases,
the commenters are listed in the Endnotes section
of this chapter. Citation in the endnotes does not
necessarily mean that the commenter supports the
position expressed in the recommendation. A
“table of contents” overview of the recommendations
is provided in the text box on pages 141-142.

Group I. Institutional Issues 
Related to Reliability 

1. Make reliability standards mandatory
and enforceable, with penalties for
noncompliance. 5

Appropriate branches of government in the United 
States and Canada should take action as required 
to make reliability standards mandatory and 
enforceable, and to provide appropriate penalties 
for noncompliance. 


A. Action by the U.S. Congress 

The U.S. Congress should enact reliability legislation
no less stringent than the provisions now
included in the pending comprehensive energy
bills, H.R. 6 and S. 2095. Specifically, these provisions
would require that:

♦ Reliability standards are to be mandatory and
enforceable, with penalties for noncompliance.

♦ Reliability standards should be developed by an
independent, international electric reliability
organization (ERO) with fair stakeholder representation
in the selection of its directors and
balanced decision-making in any ERO committee
or subordinate organizational structure.
(See text box on NERC and an ERO below.)

Overview of Task Force Recommendations: Titles Only

Group I. Institutional Issues Related to Reliability 

1. Make reliability standards mandatory and enforceable, with penalties for noncompliance. 

2. Develop a regulator-approved funding mechanism for NERC and the regional reliability councils, 
to ensure their independence from the parties they oversee. 

3. Strengthen the institutional framework for reliability management in North America. 

4. Clarify that prudent expenditures and investments for bulk system reliability (including investments
in new technologies) will be recoverable through transmission rates.

5. Track implementation of recommended actions to improve reliability. 

6. FERC should not approve the operation of new RTOs or ISOs until they have met minimum 
functional requirements. 

7. Require any entity operating as part of the bulk power system to be a member of a regional reliability
council if it operates within the council’s footprint.

8. Shield operators who initiate load shedding pursuant to approved guidelines from liability or 
retaliation. 

9. Integrate a “reliability impact” consideration into the regulatory decision-making process. 

10. Establish an independent source of reliability performance information. 

11. Establish requirements for collection and reporting of data needed for post-blackout analyses. 

12. Commission an independent study of the relationships among industry restructuring, competition,
and reliability.

13. DOE should expand its research programs on reliability-related tools and technologies. 

14. Establish a standing framework for the conduct of future blackout and disturbance 
investigations. 

Group II. Support and Strengthen NERC’s Actions of February 10, 2004 

15. Correct the direct causes of the August 14, 2003 blackout. 

16. Establish enforceable standards for maintenance of electrical clearances in right-of-way areas. 

17. Strengthen the NERC Compliance Enforcement Program. 

18. Support and strengthen NERC’s Reliability Readiness Audit Program. 

19. Improve near-term and long-term training and certification requirements for operators, reliability 
coordinators, and operator support staff. 

20. Establish clear definitions for normal, alert and emergency operational system conditions. Clarify 
roles, responsibilities, and authorities of reliability coordinators and control areas under each 
condition. 

21. Make more effective and wider use of system protection measures. 

22. Evaluate and adopt better real-time tools for operators and reliability coordinators. 

23. Strengthen reactive power and voltage control practices in all NERC regions. 

24. Improve quality of system modeling data and data exchange practices. 

25. NERC should reevaluate its existing reliability standards development process and accelerate the 
adoption of enforceable standards. 

26. Tighten communications protocols, especially for communications during alerts and emergencies.
Upgrade communication system hardware where appropriate.

27. Develop enforceable standards for transmission line ratings. 

28. Require use of time-synchronized data recorders. 

29. Evaluate and disseminate lessons learned during system restoration. 

30. Clarify criteria for identification of operationally critical facilities, and improve dissemination of 
updated information on unplanned outages. 

31. Clarify that the transmission loading relief (TLR) process should not be used in situations involving
an actual violation of an Operating Security Limit. Streamline the TLR process.


Group III. Physical and Cyber Security of North American Bulk Power Systems 

32. Implement NERC IT standards. 

33. Develop and deploy IT management procedures. 

34. Develop corporate-level IT security governance and strategies. 

35. Implement controls to manage system health, network monitoring, and incident management. 

36. Initiate U.S.-Canada risk management study. 

37. Improve IT forensic and diagnostic capabilities. 

38. Assess IT risk and vulnerability at scheduled intervals. 

39. Develop capability to detect wireless and remote wireline intrusion and surveillance. 

40. Control access to operationally sensitive equipment. 

41. NERC should provide guidance on employee background checks. 

42. Confirm NERC ES-ISAC as the central point for sharing security information and analysis. 

43. Establish clear authority for physical and cyber security. 

44. Develop procedures to prevent or mitigate inappropriate disclosure of information. 

Group IV. Canadian Nuclear Power Sector 

45. The Task Force recommends that the Canadian Nuclear Safety Commission request Ontario
Power Generation and Bruce Power to review operating procedures and operator training associated
with the use of adjuster rods.

46. The Task Force recommends that the Canadian Nuclear Safety Commission purchase and install 
backup generation equipment. 


♦ Reliability standards should allow, where 
appropriate, flexibility to accommodate 
regional differences, including more stringent 
reliability requirements in some areas, but 
regional deviations should not be allowed to 
lead to lower reliability expectations or 
performance. 

♦ An ERO-proposed standard or modification to a 
standard should take effect within the United 
States upon approval by the Federal Energy 
Regulatory Commission (FERC). 

♦ FERC should remand to the ERO for further
consideration a proposed reliability standard or
a modification to a reliability standard that it
disapproves of in whole or in part, with an explanation
of its concerns and rationale.

B. Action by FERC 

In the absence of such reliability legislation, FERC
should review its statutory authorities under
existing law, and to the maximum extent permitted
by those authorities, act to enhance reliability
by making compliance with reliability standards
enforceable in the United States. In doing so,
FERC should consult with state regulators, NERC,
and the regional reliability councils to determine
whether certain enforcement practices now in use
in some parts of the U.S. and Canada might be
applied more broadly. For example, in the
Western U.S. and Canada, many members of the
Western Electricity Coordinating Council (WECC)
include clauses in contracts for the purchase of
wholesale power that require the parties to comply
with reliability standards. In the areas of the
U.S. and Canada covered by the Northeast Power
Coordinating Council (NPCC), parties found not to
be in compliance with NERC and NPCC reliability
requirements are subject to escalating degrees of
scrutiny by their peers and the public. Both of
these approaches have had positive effects. FERC
should examine other approaches as well, and
work with state regulatory authorities to ensure
that any other appropriate actions to make reliability
standards enforceable are taken.

Action by FERC under its existing authorities
would not lessen the need for enactment of reliability
legislation by the Congress. Many U.S. parties
that should be required by law to comply with
reliability requirements are not subject to the
Commission’s full authorities under the Federal
Power Act.


NERC and the ERO

If the proposed U.S. reliability legislation
passes, the North American Electric Reliability
Council (NERC) may undertake various organizational
changes and seek recognition as the
electric reliability organization (ERO) called for
in H.R. 6 and S. 2095. For simplicity of presentation,
the many forward-looking references
below to “NERC” are intended to apply to the
ERO if the legislation is passed, and to NERC if
the legislation is not passed.


C. Action by Appropriate Authorities in Canada

The interconnected nature of the transmission
grid requires that reliability standards be identical
or compatible on both sides of the Canadian/U.S.
border. Several provincial governments in Canada
have already demonstrated support for mandatory
and enforceable reliability standards and have
either passed legislation or have taken steps to put
in place the necessary framework for implementing
such standards in Canada. The federal and
provincial governments should work together and
with appropriate U.S. authorities to complete a
framework to ensure that identical or compatible
standards apply in both countries, and that means
are in place to enforce them in all interconnected
jurisdictions.

D. Joint Actions by U.S. and Canadian
Governments

International coordination mechanisms should be
developed between the governments in Canada
and the United States to provide for government
oversight of NERC or the ERO, and approval and
enforcement of reliability standards.

E. Memoranda of Understanding between U.S.
or Canadian Government Agencies and
NERC

Government agencies in both countries should
decide (individually) whether to develop a memorandum
of understanding (MOU) with NERC that
would define the agency’s working relationship
with NERC, government oversight of NERC activities
if appropriate, and the reliability responsibilities
of the signatories.


2. Develop a regulator-approved mechanism
for funding NERC and the regional
reliability councils, to ensure their independence
from the parties they oversee. 6

U.S. and Canadian regulatory authorities should
work with NERC, the regional councils, and the
industry to develop and implement a new funding
mechanism for NERC and the regional councils
based on a surcharge in transmission rates. The
purpose would be to ensure that NERC and the
councils are appropriately funded to meet their
changing responsibilities without dependence on
the parties that they oversee. Note: Implementation
of this recommendation should be coordinated
with the review called for in Recommendation 3
concerning the future role of the regional councils.


NERC’s current $13 million/year budget is funded
as part of the dues that transmission owners, generators,
and other market participants pay to the
ten regional reliability councils, which then fund
NERC. This arrangement makes NERC subject to
the influence of the reliability councils, which are
in turn subject to the influence of their control
areas and other members. It also compromises the
independence of both NERC and the councils in
relation to the entities whose actions they oversee,
and makes it difficult for them to act forcefully
and objectively to maintain the reliability of the
North American bulk power system. Funding
NERC and the councils through a transmission
rate surcharge administered and disbursed under
regulatory supervision would enable the organizations
to be more independent of the industry, with
little impact on electric bills. The dues that companies
pay to the regional councils are passed
through to electricity customers today, so the net
impacts on customer bills from shifting to a rate
surcharge would be minimal.

Implementation of the recommendations presented
in this report will involve a substantial
increase in NERC’s functions and responsibilities,
and require an increase in NERC’s annual budget.
The additional costs, however, would be small in
comparison to the cost of a single major blackout.


3. Strengthen the institutional framework 
for reliability management in North 
America. 7 

FERC, DOE and appropriate authorities in Canada
should work with the states, NERC, and the industry,
to evaluate and develop appropriate modifications
to the existing institutional framework for
reliability management. In particular, the affected
government agencies should:

A. Commission an independent review by qualified
experts in organizational design and management
to address issues concerning how best
to structure an international reliability organization
for the long term.

B. Based in part on the results of that review,
develop metrics for gauging the adequacy of
NERC’s performance, and specify the functions
of the NERC Board of Trustees and the procedure
for selecting the members of the Board.

C. Examine and clarify the future role of the 
regional reliability councils, with particular 
attention to their mandate, scope, structure, 
responsibilities, and resource requirements. 

D. Examine NERC’s proposed Functional Model 
and set minimum requirements under which 
NERC would certify applicants’ qualifications 
to perform critical functions. 

E. Request NERC and the regional councils to suspend
designation of any new control areas (or
sub-control areas) until the minimum requirements
in section D (above) have been established,
unless an applicant shows that such
designation would significantly enhance reliability.

F. Determine ways to enhance reliability operations
in the United States through simplified
organizational boundaries and resolution of
seams issues.


A and B. Reshaping NERC 

The far-reaching organizational changes in the
North American electricity industry over the past
decade have already induced major changes in the
nature of NERC as an organization. However, the
process of change at NERC is far from complete.
Important additional changes are needed such as
the shift to enforceable standards, development of
an effective monitoring capability, and funding
that is not dependent on the industry. These
changes will strengthen NERC as an organization.
In turn, to properly serve overarching public policy
concerns, this strengthening of NERC’s capabilities
will have to be balanced with increased
government oversight, more specific metrics for
gauging NERC’s performance as an organization,
and greater transparency concerning the functions
of its senior management team (including its
Board of Trustees) and the procedures by which
those individuals are selected. The affected government
agencies should jointly commission an
independent review of these and related issues to
aid them in making their respective decisions.

C. The Role of the Regional Reliability Councils 

North America’s regional reliability councils have
evolved into a disparate group of organizations
with varying responsibilities, expertise, roles,
sizes and resources. Some have grown from a reliability
council into an ISO or RTO (ERCOT and
SPP), some span less than a single state (FRCC and
ERCOT) while others cover many states and provinces
and cross national boundaries (NPCC and
WECC). Several cross reliability coordinator
boundaries. It is time to evaluate the appropriate
size and scope of a regional council, the specific
tasks that it should perform, and the appropriate
level of resources, expertise, and independence
that a regional reliability council needs to perform
those tasks effectively. This evaluation should
also address whether the councils as currently
constituted are appropriate to meet future reliability
needs.

D. NERC’s Functional Model 

The transition to competition in wholesale power
markets has been accompanied by increasing
diversity in the kinds of entities that need to be in
compliance with reliability standards. Rather than
resist or attempt to influence this evolution,
NERC’s response—through the Functional
Model—has been to seek a means of enabling reliability
to be maintained under virtually any institutional
framework. The Functional Model
identifies sixteen basic functions associated with
operating the bulk electric systems and maintaining
reliability, and the capabilities that an organization
must have in order to perform a given
function. (See Functional Model text box below.)

NERC acknowledges that maintaining reliability 
in some frameworks may be more difficult or more 
expensive than in others, but it stresses that as 
long as some responsible party addresses each 
function and the rules are followed, reliability will 
be preserved. By implication, the pros and cons of 
alternative institutional frameworks in a given 
region—which may affect aspects of electric 
industry operations other than reliability—are 
matters for government agencies to address, not 
NERC. 

One of the major purposes of the Functional
Model is to create a vehicle through which NERC
will be able to identify an entity responsible for
performing each function in every part of the three
North American interconnections. NERC considers
four of the sixteen functions to be especially
critical for reliability. For these functions, NERC
intends, upon application by an entity, to review
the entity’s capabilities, and if appropriate, certify
that the entity has the qualifications to perform
that function within the specified geographic area.
For the other twelve functions, NERC proposes to
“register” entities as responsible for a given function
in a given area, upon application.

All sixteen functions are presently being performed
to varying degrees by one entity or another
in all areas of North America. Frequently an
entity performs a combination of functions, but
there is great variety from one region to another in
how the functions are bundled and carried out.
Whether all of the parties who are presently performing
the four critical functions would meet
NERC’s requirements for certification is not
known, but the proposed process provides a
means of identifying any weaknesses that need to
be rectified.

At present, after protracted debate, the Functional
Model appears to have gained widespread but cautious
support from the diverse factions across the
industry, while the regulators have not taken a
position. In some parts of North America, such as
the Northeast, large regional organizations will
probably be certified to perform all four of the
critical functions for their respective areas. In
other areas, capabilities may remain less aggregated,
and the institutional structure may remain
more complex.


Sixteen Functions in NERC’s Functional
Model

♦ Operating Reliability

♦ Planning Reliability

♦ Balancing (generation and demand)

♦ Interchange

♦ Transmission service

♦ Transmission ownership

♦ Transmission operations

♦ Transmission planning

♦ Resource planning

♦ Distribution

♦ Generator ownership

♦ Generator operations

♦ Load serving

♦ Purchasing and selling

♦ Standards development

♦ Compliance monitoring

NERC regards the first four functions listed above
(Operating Reliability, Planning Reliability,
Balancing, and Interchange) as especially critical
to reliability. Accordingly, it proposes to certify
applicants that can demonstrate that they have the
capabilities required to perform those functions. The
Operating Reliability authority would correspond
to today’s reliability coordinator, and the
Balancing authority to today’s control area
operator.

Working with NERC and the industry, FERC and
authorities in Canada should review the Functional
Model to ensure that operating hierarchies
and entities will facilitate, rather than hinder,
efficient reliability operations. At a minimum,
the review should identify ways to eliminate inappropriate
commercial incentives to retain control
area status that do not support reliability objectives;
address operational problems associated
with institutional fragmentation; and set minimum
requirements with respect to the capabilities
requiring NERC certification, concerning subjects
such as:

1. Fully operational backup control rooms. 

2. System-wide (or wider) electronic map boards
or functional equivalents, with data feeds that
are independent of the area’s main energy management
system (EMS).

3. Real-time tools that are to be available to the
operator, with backups. (See Recommendation
22 below for more detail concerning minimum
requirements and guidelines for real-time operating
tools.)

4. SCADA and EMS requirements, including
backup capabilities.

5. Training programs for all personnel who have
access to a control room or supervisory responsibilities
for control room operations. (See Recommendation
19 for more detail on the Task
Force’s views regarding training and certification
requirements.)

6. Certification requirements for control room 
managers and staff. 

E. Designation of New Control Areas 

Significant changes in the minimum functional
requirements for control areas (or balancing
authorities, in the context of the Functional
Model) may result from the review called for
above. Accordingly, the Task Force recommends
that regulatory authorities should request NERC
and the regional councils not to certify any new
control areas (or sub-control areas) until the
appropriate regulatory bodies have approved the
minimum functional requirements for such bodies,
unless an applicant shows that such designation
would significantly enhance reliability.

F. Boundary and Seam Issues and Minimum
Functional Requirements

Some observers believe that some U.S. regions
have too many control areas performing one or
more of the four critical reliability functions.
In many cases, these entities exist to retain commercial
advantages associated with some of these
functions. The resulting institutional fragmentation
and decentralization of control leads to a
higher number of operating contacts and seams,
complex coordination requirements, misalignment
of control areas with other electrical boundaries
and/or operating hierarchies, inconsistent
practices and tools, and increased compliance
monitoring requirements. These consequences
hamper the efficiency and reliability of grid
operations.

As shown above (text box on page 14), MISO, as
reliability coordinator for its region, is responsible
for dealing with 37 control areas, whereas PJM
now spans 9 control areas, ISO-New England has
2, and the New York ISO, Ontario’s IMO, Texas’
ERCOT, and Quebec’s Trans-Energie are themselves
the control area operators for their respective
large areas. Moreover, it is not clear that small
control areas are financially able to provide the
facilities and services needed to perform control
area functions at the level needed to maintain reliability.
This concern applies also to the four types
of entities that NERC proposes to certify under the
Functional Model (i.e., Reliability Authority,
Planning Authority, Balancing Authority, and
Interchange Authority).

For the long term, the regulatory agencies should
continue to seek ways to ensure that the regional
operational frameworks that emerge through the
implementation of the Functional Model promote
reliable operations. Any operational framework
will represent some combination of tradeoffs, but
reliability is a critically important public policy
objective and should be a primary design
criterion.


4. Clarify that prudent expenditures and
investments for bulk system reliability
(including investments in new technologies)
will be recoverable through transmission
rates. 8

FERC and appropriate authorities in Canada
should clarify that prudent expenditures and
investments by regulated companies to maintain or
improve bulk system reliability will be recoverable
through transmission rates.

In the U.S., FERC and DOE should work with state
regulators to identify and resolve issues related to
the recovery of reliability costs and investments
through retail rates. Appropriate authorities in
Canada should determine whether similar efforts
are warranted.

Companies will not make the expenditures and
investments required to maintain or improve the
reliability of the bulk power system without credible
assurances that they will be able to recover
their costs.


5. Track implementation of recommended
actions to improve reliability. 9

In the requirements issued on February 10, 2004,
NERC announced that it and the regional councils
would establish a program for documenting completion
of recommendations resulting from the
August 14 blackout and other historical outages, as
well as NERC and regional reports on violations of
reliability standards, results of compliance audits,
and lessons learned from system disturbances. The
regions are to report on a quarterly basis to NERC.

In addition, NERC intends to initiate by January 1,
2005 a reliability performance monitoring function
that will evaluate and report on trends in bulk
electric system reliability performance.

The Task Force supports these actions strongly.
However, many of the Task Force’s recommendations
pertain to government bodies as well as
NERC. Accordingly:

A. Relevant agencies in the U.S. and Canada
should cooperate to establish mechanisms for
tracking and reporting to the public on implementation
actions in their respective areas of
responsibility.

B. NERC should draw on the above-mentioned
quarterly reports from its regional councils to
prepare annual reports to FERC, appropriate
authorities in Canada, and the public on the
status of the industry’s compliance with recommendations
and important trends in electric
system reliability performance.

The August 14 blackout shared a number of contributing
factors with prior large-scale blackouts,
confirming that the lessons and recommendations
from earlier blackouts had not been adequately
implemented, at least in some geographic areas.
Accordingly, parallel and coordinated efforts are
needed by the relevant government agencies and
NERC to track the implementation of recommendations
by governments and the electricity industry.
WECC and NPCC have already established
programs that could serve as models for tracking
implementation of recommendations.


6. FERC should not approve the operation 
of a new RTO or ISO until the applicant 
has met the minimum functional 
requirements for reliability 
coordinators.

The events of August 14 confirmed that MISO did 
not yet have all of the functional capabilities 
required to fulfill its responsibilities as reliability 
coordinator for the large area within its footprint. 
FERC should not authorize a new RTO or ISO to 
become operational until the RTO or ISO has
verified that all critical reliability capabilities will be
functional upon commencement of RTO or ISO 
operations. 


7. Require any entity operating as part of 
the bulk power system to be a member 
of a regional reliability council if it
operates within the council’s footprint. 10

The Task Force recommends that FERC and
appropriate authorities in Canada be empowered
through legislation, if necessary, to require all
entities that operate as part of the bulk electric system
to certify that they are members of the regional
reliability council for all NERC regions in which
they operate.

This requirement is needed to ensure that all
relevant parties are subject to NERC standards,
policies, etc., in all NERC regions in which they
operate. Action by the Congress or legislative
bodies in Canada may be necessary to provide
appropriate authority.


8. Shield operators who initiate load shedding
pursuant to approved guidelines
from liability or retaliation. 11 

Legislative bodies and regulators should: 1)
establish that operators (whether organizations or
individuals) who initiate load shedding pursuant to
operational guidelines are not subject to liability
suits; and 2) affirm publicly that actions to shed
load pursuant to such guidelines are not indicative
of operator failure.

Timely and sufficient action to shed load on 
August 14 would have prevented the spread of the 
blackout beyond northern Ohio. NERC has 
directed all the regional councils in all areas of 
North America to review the applicability of plans 
for under-voltage load shedding, and to support 
the development of such capabilities where they 
would be beneficial. However, organizations and 
individual operators may hesitate to initiate such 
actions in appropriate circumstances without 
assurances that they will not be subject to liability 
suits or other forms of retaliation, provided their 
action is pursuant to previously approved 
guidelines. 


9. Integrate a “reliability impact” consideration
into the regulatory decision-making
process. 12

The Task Force recommends that FERC, appropriate
authorities in Canada, and state regulators
integrate a formal reliability impact consideration into
their regulatory decision-making to ensure that
their actions or initiatives either improve or at
minimum do no harm to reliability.

Regulatory actions can have unintended
consequences. For example, in reviewing proposed
utility company mergers, FERC’s primary focus has
been on financial and rate issues, as opposed to
the reliability implications of such mergers. To
minimize unintended harm to reliability, and aid
the improvement of reliability where appropriate,
the Task Force recommends that regulators
incorporate a formal reliability impact consideration
into their decision processes. At the same time,
regulators should be watchful for use of alleged
reliability impacts as a smokescreen for
anticompetitive or discriminatory behavior.


10. Establish an independent source of 
reliability performance information. 13 

The U.S. Department of Energy’s Energy Information
Administration (EIA), in coordination with
other interested agencies and data sources (FERC,
appropriate Canadian government agencies, NERC,
RTOs, ISOs, the regional councils, transmission
operators, and generators) should establish
common definitions and information collection
standards. If the necessary resources can be identified,
EIA should expand its current activities to include
information on reliability performance.


Energy policy makers and a wide range of
economic decision makers need objective, factual
information about basic trends in reliability
performance. EIA and the other organizations cited
above should identify information gaps in federal
data collections covering reliability performance
and physical characteristics. Plans to fill those
gaps should be developed, and the associated
resource requirements determined. Once those
resources have been acquired, EIA should publish
information on trends, patterns, costs, etc. related
to reliability performance.


11. Establish requirements for collection 
and reporting of data needed for 
post-blackout analyses. 

FERC and appropriate authorities in Canada 
should require generators, transmission owners, 
and other relevant entities to collect and report 
data that may be needed for analysis of blackouts 
and other grid-related disturbances. 

The investigation team found that some of the data
needed to analyze the August 14 blackout fully
was not collected at the time of the events, and
thus could not be reported. Some of the data that
was reported was based on incompatible
definitions and formats. As a result, there are aspects of
the blackout, particularly concerning the
evolution of the cascade, that may never be fully
explained. FERC, EIA and appropriate authorities
in Canada should consult with NERC, key
members of the investigation team, and the industry to
identify information gaps, adopt common
definitions, and establish filing requirements.


12. Commission an independent study of 
the relationships among industry 
restructuring, competition, and
reliability. 14

DOE and Natural Resources Canada should
commission an independent study of the relationships
among industry restructuring, competition in 
power markets, and grid reliability, and how those 
relationships should be managed to best serve the 
public interest. 

Some participants at the public meetings held in
Cleveland, New York and Toronto to review the
Task Force’s Interim Report expressed the view
that the restructuring of electricity markets for
competition in many jurisdictions has, itself,
increased the likelihood of major supply
interruptions. Some of these commenters assert that the
transmission system is now being used to transmit
power over distances and at volumes that were not
envisioned when the system was designed, and
that this functional shift has created major risks
that have not been adequately addressed. Indeed,
some commenters believe that restructuring was a
major cause of the August 14 blackout.

The Task Force believes that the Interim Report
accurately identified the primary causes of the
blackout. It also believes that had existing
reliability requirements been followed, either the
disturbance in northern Ohio that evolved on August 14
into a blackout would not have occurred, or it
would have been contained within the FE control
area.

Nevertheless, as discussed at the beginning of this
chapter, the relationship between competition in
power markets and reliability is both important
and complex, and careful management and sound
rules are required to achieve the public policy
goals of reasonable electricity prices and high
reliability. At the present stage in the evolution of
these markets, it is worthwhile for DOE and
Natural Resources Canada (in consultation with FERC
and the Canadian Council of Energy Ministers) to
commission an independent expert study to
provide advice on how to achieve and sustain an
appropriate balance in this important area.

Among other things, this study should take into 
account factors such as: 

♦ Historical and projected load growth 

♦ Location of new generation in relation to old 
generation and loads 

♦ Zoning and NIMBY 15 constraints on siting of 
generation and transmission 

♦ Lack of new transmission investment and its 
causes 

♦ Regional comparisons of impact of wholesale 
electric competition on reliability performance 
and on investments in reliability and 
transmission 

♦ The financial community’s preferences and 
their effects on capital investment patterns 

♦ Federal vs. state jurisdictional concerns 

♦ Impacts of state caps on retail electric rates 

♦ Impacts of limited transmission infrastructure
on energy costs, transmission congestion, and
reliability

♦ Trends in generator fuel and wholesale
electricity prices

♦ Trends in power flows, line losses, voltage
levels, etc.


13. DOE should expand its research
programs on reliability-related tools and
technologies. 16 

DOE should expand its research agenda, and
consult frequently with Congress, FERC, NERC, state
regulators, Canadian authorities, universities, and 
the industry in planning and executing this agenda. 

More investment in research is needed to improve
grid reliability, with particular attention to
improving the capabilities and tools for system
monitoring and management. Research on
reliability issues and reliability-related technologies
has a large public-interest component, and
government support is crucial. DOE already leads
many research projects in this area, through
partnerships with industry and research under way at
the national laboratories and universities. DOE’s
leadership and frequent consultation with many
parties are essential to ensure the allocation of
scarce research funds to urgent projects, bring the
best talent to bear on such projects, and enhance
the dissemination and timely application of
research results.

Important areas for reliability research include but 
are not limited to: 

♦ Development of practical real-time applications
for wide-area system monitoring using phasor
measurements and other synchronized
measuring devices, including post-disturbance
applications.

♦ Development and use of enhanced techniques
for modeling and simulation of contingencies,
blackouts, and other grid-related disturbances.

♦ Investigation of protection and control
alternatives to slow or stop the spread of a cascading
power outage, including demand response
initiatives to slow or halt voltage collapse.

♦ Re-evaluation of generator and customer
equipment protection requirements based on voltage
and frequency phenomena experienced during
the August 14, 2003, cascade.

♦ Investigation of protection and control of
generating units, including the possibility of multiple
steps of over-frequency protection and possible
effects on system stability during major
disturbances.

♦ Development of practical human factors
guidelines for power system control centers.

♦ Study of obstacles to the economic deployment
of demand response capability and distributed
generation.

♦ Investigation of alternative approaches to
monitoring right-of-way vegetation management.

♦ Study of air traffic control, the airline industry, 
and other relevant industries for practices and 
ideas that could reduce the vulnerability of the 
electricity industry and its reliability managers 
to human error. 

Cooperative and complementary research and
funding between nations and between
government and industry efforts should be encouraged.


14. Establish a standing framework for the 
conduct of future blackout and
disturbance investigations. 17

The U.S., Canadian, and Mexican governments, in
consultation with NERC, should establish a
standing framework for the investigation of future
blackouts, disturbances, or other significant grid-related
incidents.

Fortunately, major blackouts are not frequent,
which makes it important to study such events
carefully to learn as much as possible from the
experience. In the weeks immediately after
August 14, important lessons were learned
pertaining not only to preventing and minimizing
future blackouts, but also to the efficient and
fruitful investigation of future grid-related events.

Appropriate U.S., Canadian, and Mexican
government agencies, in consultation with NERC and
other organizations, should prepare an agreement 
that, among other considerations: 

♦ Establishes criteria for determining when an 
investigation should be initiated. 

♦ Establishes the composition of a task force to
provide overall guidance for the inquiry. The
task force should be international if the
triggering event had international consequences.

♦ Provides for coordination with state and
provincial governments, NERC and other appropriate
entities.

♦ Designates agencies responsible for issuing
directives concerning preservation of records,
provision of data within specified periods to a
data warehouse facility, conduct of onsite
interviews with control room personnel, etc.

♦ Provides guidance on confidentiality of data. 

♦ Identifies types of expertise likely to be needed 
on the investigation team. 

Group II. Support and Strengthen 
NERC’s Actions of February 10, 2004 

On February 10, 2004, after taking the findings of
the Task Force’s investigation into the August 14,
2003, blackout into account, the NERC Board of
Trustees approved a series of actions and strategic
and technical initiatives intended to protect the
reliability of the North American bulk electric
system. (See Appendix D for the full text of the
Board’s statement of February 10.) Overall, the
Task Force supports NERC’s actions and
initiatives strongly. On some subjects, the Task Force
advocates additional measures, as shown in the
next 17 recommendations.


15. Correct the direct causes of the 
August 14, 2003 blackout. 18 

NERC played an important role in the Task Force’s 
blackout investigation, and as a result of the
findings of the investigation, NERC issued directives on
February 10, 2004 to FirstEnergy, MISO, and PJM 
to complete a series of remedial actions by June 30, 
2004 to correct deficiencies identified as factors 
contributing to the blackout of August 14, 2003. 

(For specifics on the actions required by NERC, see 
Appendix D.) 

The Task Force supports and endorses NERC’s 
near-term requirements strongly. It recommends 
the addition of requirements pertaining to ECAR, 
and several other additional elements, as described 
below. 


A. Corrective Actions to Be Completed by 
FirstEnergy by June 30, 2004 

The full text of the remedial actions NERC has 
required that FirstEnergy (FE) complete by June 30 
is provided in Appendix D. The Task Force
recommends the addition of certain elements to these
requirements, as described below. 

1. Examination of Other FE Service Areas 

The Task Force’s investigation found severe
reactive power and operations criteria deficiencies in
the Cleveland-Akron area. 


NERC: 

Specified measures required in that area to 
help ensure the reliability of the FE system and 
avoid undue risks to neighboring systems. 
However, the blackout investigation did not
examine conditions in FE service areas in other
states. 

Task Force: 

Recommends that NERC require FE to review 
its entire service territory, in all states, to
determine whether similar vulnerabilities exist
and require prompt attention. This review 
should be completed by June 30, 2004, and the 
results reported to FERC, NERC, and utility 
regulatory authorities in the affected states. 

2. Interim Voltage Criteria 

NERC: 

Required that FE, consistent with or as part of a
study ordered by FERC on December 24,
2003, 19 determine the minimum acceptable
location-specific voltages at all 345 kV and 138
kV buses and all generating stations within the
FE control area (including merchant plants).
Further, FE is to determine the minimum
dynamic reactive reserves that must be maintained
in local areas to ensure that these minimum
voltages are met following contingencies
studied in accordance with ECAR Document
1. 20 Criteria and minimum voltage requirements
must comply with NERC planning criteria,
including Table 1A, Category C3, and Operating
Policy 2. 21

Task Force: 

Recommends that NERC appoint a team, 
joined by representatives from FERC and the 
Ohio Public Utility Commission, to review 
and approve all such criteria. 
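
As a rough illustration of how location-specific minimum voltages might be compared with post-contingency study results, the following Python sketch flags buses that fall below their criterion. The bus names, voltage values, and the function name `voltage_violations` are hypothetical; they are not drawn from the FERC-ordered study, ECAR Document 1, or NERC's planning criteria.

```python
def voltage_violations(min_kv, post_contingency_kv):
    """Return {contingency: [(bus, kv, minimum), ...]} for buses below minimum.

    min_kv: mapping of bus name -> minimum acceptable voltage (kV).
    post_contingency_kv: mapping of contingency name -> {bus name -> kV}.
    """
    violations = {}
    for contingency, voltages in post_contingency_kv.items():
        low = [
            (bus, kv, min_kv[bus])
            for bus, kv in voltages.items()
            if bus in min_kv and kv < min_kv[bus]
        ]
        if low:
            violations[contingency] = low
    return violations


# Hypothetical values, for illustration only.
criteria = {"Bus A 345": 327.75, "Bus B 138": 131.1}
study = {
    "loss of line 1": {"Bus A 345": 331.0, "Bus B 138": 129.5},
    "loss of unit 2": {"Bus A 345": 329.0, "Bus B 138": 133.0},
}
print(voltage_violations(criteria, study))
```

A real assessment would draw the voltages from a power-flow study for each contingency; the sketch covers only the final comparison step.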

3. FE Actions Based on FERC-Ordered Study

NERC:

Required that when the FERC-ordered study is
completed, FE is to adopt the planning and
operating criteria determined as a result of that
study and update the operating criteria and
procedures for its system operators. If the study
indicates a need for system reinforcement, FE
is to develop a plan for developing such
resources as soon as practical and develop
operational procedures or other mitigating programs
to maintain safe operating conditions until
such time that the necessary system
reinforcements can be made.

Task Force:

Recommends that a team appointed by NERC 
and joined by representatives from FERC and 
the Ohio Public Utility Commission should
review and approve this plan.

4. Reactive Resources

NERC:

Required that FE inspect all reactive resources, 
including generators, and ensure that all are 
fully operational. FE is also required to verify 
that all installed capacitors have no blown 
fuses and that at least 98% of installed
capacitors (69 kV and higher) are available for service
during the summer of 2004. 

Task Force: 

Recommends that NERC also require FE to
confirm that all non-utility generators in its
area have entered into contracts for the sale of
generation committing them to producing
increased or maximum reactive power when
called upon by FE or MISO to do so. Such
contracts should ensure that the generator would
be compensated for revenue losses associated
with a reduction in real power sales in order
to increase production of reactive power.
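
The two inspection checks in NERC's requirement (no blown fuses on any installed capacitor, and at least 98% of capacitors rated 69 kV and higher available for service) could be tallied along the following lines. This is a minimal sketch: the `CapacitorBank` record and its field names are hypothetical and do not reflect FE's or NERC's actual data formats.

```python
from dataclasses import dataclass


@dataclass
class CapacitorBank:
    name: str          # hypothetical identifier
    kv: float          # nominal voltage, kV
    available: bool    # True if the bank can be placed in service
    blown_fuses: int   # number of blown fuses found at inspection


def inspection_summary(banks, min_kv=69.0, required_pct=98):
    """Summarize the fuse check (all banks) and availability check (>= min_kv)."""
    in_scope = [b for b in banks if b.kv >= min_kv]
    available = sum(1 for b in in_scope if b.available)
    # Integer comparison avoids floating-point rounding at the threshold.
    meets_pct = available * 100 >= required_pct * len(in_scope)
    return {
        "in_scope": len(in_scope),
        "available": available,
        "meets_availability": meets_pct,
        "banks_with_blown_fuses": [b.name for b in banks if b.blown_fuses > 0],
    }


# Hypothetical inventory: 50 banks at 138 kV (one unavailable), one 34.5 kV bank.
banks = [CapacitorBank(f"C{i}", 138.0, True, 0) for i in range(49)]
banks.append(CapacitorBank("C49", 138.0, False, 0))
banks.append(CapacitorBank("D1", 34.5, True, 1))
print(inspection_summary(banks))
```

In this made-up inventory, 49 of 50 in-scope banks are available, which sits exactly at the 98% threshold, while the blown fuse on the lower-voltage bank is still reported.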

5. Operational Preparedness and Action Plan

NERC:

Required that FE prepare and submit to ECAR 
an Operational Preparedness and Action Plan 
to ensure system security and full compliance 
with NERC and planning and operating
criteria, including ECAR Document 1.

Task Force: 

Recommends that NERC require copies of this
plan to be provided to FERC, DOE, the Ohio
Public Utility Commission, and the public
utility commissions in other states in which
FE operates. The Task Force also recommends
that NERC require FE to invite its system
operations partners (control areas adjacent to FE,
plus MISO, ECAR, and PJM) to participate in
the development of the plan and agree to its
implementation in all aspects that could affect
their respective systems and operations.

6. Emergency Response Resources 

NERC: 

Required that FE develop a capability to reduce
load in the Cleveland-Akron area by 1500 MW
within ten minutes of a directive to do so by
MISO or the FE system operator. Such a
capability may be provided by automatic or
manual load shedding, voltage reduction,
direct-controlled commercial or residential load
management, or any other method or
combination of methods capable of achieving the 1500
MW of reduction in ten minutes without
adversely affecting other interconnected systems.
The amount of required load reduction
capability may be modified to an amount shown by the
FERC-ordered study to be sufficient for
response to severe contingencies and if approved
by ECAR and NERC.

Task Force: 

Recommends that NERC require MISO’s
approval of any change in the amount of
required load reduction capability. It also
recommends that NERC require FE’s load
reduction plan to be shared with the Ohio Public
Utilities Commission and that FE should
communicate with all communities in the affected
areas about the plan and its potential
consequences.
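
To make the arithmetic of the 1500 MW, ten-minute requirement concrete, the sketch below totals the load reduction that a combination of methods could deliver within the deadline. The method names, MW figures, and deployment times are hypothetical, and the sketch assumes the methods act on separate load so that their contributions simply add.

```python
from dataclasses import dataclass


@dataclass
class LoadReductionMethod:
    name: str
    mw: float                 # MW of load the method can remove
    minutes_to_deploy: float  # time from directive to full effect


def deliverable_mw(methods, deadline_minutes=10.0):
    """Total MW from methods that take full effect within the deadline."""
    return sum(m.mw for m in methods if m.minutes_to_deploy <= deadline_minutes)


def meets_target(methods, target_mw=1500.0, deadline_minutes=10.0):
    return deliverable_mw(methods, deadline_minutes) >= target_mw


# Hypothetical portfolio, for illustration only.
portfolio = [
    LoadReductionMethod("automatic load shedding", 800.0, 1.0),
    LoadReductionMethod("voltage reduction", 300.0, 5.0),
    LoadReductionMethod("direct-controlled load management", 500.0, 8.0),
    LoadReductionMethod("manual load shedding", 400.0, 15.0),
]
print(deliverable_mw(portfolio), meets_target(portfolio))
```

Here the manual step is excluded because it takes longer than ten minutes, leaving 1600 MW from the other three methods. The requirement that the reduction not adversely affect other interconnected systems is outside the scope of this tally.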

7. Emergency Response Plan

NERC:

Required that FE develop an emergency
response plan, including arrangements for
deploying the load reduction capabilities noted
above. The plan is to include criteria for
determining the existence of an emergency and
identify various possible states of emergency.
The plan is to include detailed operating
procedures and communication protocols with all
the relevant entities including MISO, FE
operators, and market participants within the FE
area that have an ability to vary generation
output or shed load upon orders from FE
operators. The plan should include procedures for
load restoration after the declaration that the
FE system is no longer in an emergency
operating state.

Task Force: 

Recommends that NERC require FE to offer its
system operations partners (i.e., control
areas adjacent to FE, plus MISO, ECAR, and
PJM) an opportunity to contribute to the
development of the plan and agree to its key
provisions.

8. Operator Communications

NERC:

Required that FE develop communications
procedures for FE operating personnel to use
within FE, with MISO and neighboring
systems, and others. The procedure and the
operating environment within the FE system
control center should allow control room staff to
focus on reliable system operations and avoid
distractions such as calls from customers and
others who are not responsible for operation of
a portion of the transmission system.

Task Force: 

Recommends that NERC require these
procedures to be shared with and agreed to by
control areas adjacent to FE, plus MISO, ECAR,
and PJM, and any other affected system
operations partners, and that these procedures be
tested in a joint drill.

9. Reliability Monitoring and System
Management Tools

NERC: 

Required that FE ensure that its state estimator 
and real-time contingency analysis functions 
are used to execute reliably full contingency 
analyses automatically every ten minutes or on 
demand, and used to notify operators of
potential first contingency violations.

Task Force: 

Recommends that NERC also require FE to
ensure that its information technology support
function does not change the effectiveness of 
reliability monitoring or management tools in 
any way without the awareness and consent 
of its system operations staff. 
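
One simple way to audit the ten-minute execution requirement for contingency analysis is to scan a log of completed runs for gaps longer than the allowed interval. The sketch below assumes such a log exists as a list of completion timestamps; the log format and the function name `overdue_gaps` are hypothetical and are not taken from any energy management system.

```python
from datetime import datetime, timedelta


def overdue_gaps(run_times, max_interval=timedelta(minutes=10)):
    """Return (earlier, later) pairs of consecutive runs spaced beyond max_interval."""
    ordered = sorted(run_times)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_interval
    ]


# Hypothetical completion times: the third run is 15 minutes after the second.
start = datetime(2004, 6, 30, 14, 0)
runs = [start, start + timedelta(minutes=10), start + timedelta(minutes=25)]
for earlier, later in overdue_gaps(runs):
    print(f"gap of {later - earlier} between {earlier} and {later}")
```

A check of this kind only shows that runs completed on schedule; it says nothing about whether each run produced a valid solution, which would need to be confirmed separately.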

10. GE XA21 System Updates and Transition to 
New Energy Management System 

NERC: 

Required that until FE replaces its GE XA21
Energy Management System, FE should
implement all current known fixes for the GE XA21
system necessary to ensure reliable and stable
operation of critical reliability functions, and
particularly to correct the alarm processor
failure that occurred on August 14, 2003.

Task Force: 

Recommends that NERC require FE to design
and test the transition to its planned new
energy management system to ensure that the
system functions effectively, that the
transition is made smoothly, that the system’s
operators are adequately trained, and that all
operating partners are aware of the transition.


11. Emergency Preparedness Training for 
Operators 

NERC: 

Required that all reliability coordinators,
control areas, and transmission operators provide
at least five days of training and drills using
realistic simulation of system emergencies for
each staff person with responsibility for the
real-time operation or reliability monitoring of
the bulk electric system. This system
emergency training is in addition to other training
requirements. The term “realistic simulation”
includes a variety of tools and methods that
present operating personnel with situations to
improve and test diagnostic and decision-making
skills in an environment that resembles
expected conditions during a particular type of
system emergency.

Task Force: 

Recommends that to provide effective training 
before June 30, 2004, NERC should require FE 
to consider seeking the assistance of another 
control area or reliability coordinator known 
to have a quality training program (such as 
IMO or ISO-New England) to provide the 
needed training with appropriate FE-specific 
modifications. 

B. Corrective Actions to be Completed by MISO 
by June 30, 2004 

1. Reliability Tools

NERC:

Required that MISO fully implement and test
its topology processor to provide its operating
personnel a real-time view of the system status
for all transmission lines operating and all
generating units within its system, and all critical
transmission lines and generating units in
neighboring systems. Alarms should be
provided for operators for all critical transmission
line outages and voltage violations. MISO is to
establish a means of exchanging outage
information with its members and adjacent systems
such that the MISO state estimator has accurate
and timely information to perform as designed.
MISO is to fully implement and test its state
estimation and real-time contingency analysis
tools to ensure they can operate reliably no less
than every ten minutes. MISO is to provide
backup capability for all functions critical to
reliability.

Task Force:

Recommends that NERC require MISO to
ensure that its information technology support
staff does not change the effectiveness of
reliability monitoring or management tools in
any way without the awareness and consent
of its system operations staff.

2. Operating Agreements

NERC:

Required that MISO reevaluate its operating
agreements with member entities to verify its
authority to address operating issues,
including voltage and reactive management, voltage
scheduling, the deployment and redispatch of
real and reactive reserves for emergency
response, and the authority to direct actions
during system emergencies, including shedding
load.

Task Force: 

Recommends that NERC require that any 
problems or concerns related to these
operating issues be raised promptly with FERC and
MISO’s members for resolution. 

C. Corrective Actions to be Completed by PJM 
by June 30, 2004 

NERC: 

Required that PJM reevaluate and improve its 
communications protocols and procedures
between PJM and its neighboring control areas
and reliability coordinators. 

Task Force: 

Recommends that NERC require definitions 
and usages of key terms be standardized, and 
non-essential communications be minimized 
during disturbances, alerts, or emergencies. 
NERC should also require PJM, MISO, and 
their member companies to conduct one or 
more joint drills using the new
communications procedures.

D. Task Force Recommendations for Corrective 
Actions to be Completed by ECAR by August 
14, 2004 

1. Modeling and Assessments

Task Force:

Recommends that NERC require ECAR to
reevaluate its modeling procedures,
assumptions, scenarios and data for seasonal
assessments and extreme conditions evaluations.
ECAR should consult with an expert team
appointed by NERC (joined by representatives
from FERC, DOE, interested state
commissions, and MISO) to develop better modeling
procedures and scenarios, and obtain review
of future assessments by the expert team.

2. Verification of Data and Assumptions 

Task Force: 

Recommends that NERC require ECAR to
reexamine and validate all data and model
assumptions against current physical asset
capabilities and match modeled assets (such as
line characteristics and ratings, and generator
reactive power output capabilities) to current
operating study assessments.

3. Ensure Consistency of Members’ Data 

Task Force: 

Recommends that NERC require ECAR to
conduct a data validation and exchange exercise
to be sure that its members are using accurate,
consistent, and current physical asset
characteristics and capabilities for both long-term
and seasonal assessments and operating
studies.
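
A data validation and exchange exercise of the kind recommended here could begin with a simple cross-check of the values that different members report for the same asset. The following sketch is illustrative only: the member names, asset identifiers, ratings, and the function name `inconsistent_assets` are hypothetical, and a real exercise would also need to verify the values against the physical equipment.

```python
from collections import defaultdict


def inconsistent_assets(submissions, tolerance=0.0):
    """Return assets whose reported values differ across members beyond tolerance.

    submissions: iterable of (member, asset_id, value) tuples.
    """
    by_asset = defaultdict(dict)
    for member, asset_id, value in submissions:
        by_asset[asset_id][member] = value
    return {
        asset_id: reported
        for asset_id, reported in by_asset.items()
        if max(reported.values()) - min(reported.values()) > tolerance
    }


# Hypothetical line ratings (MVA) as modeled by two members.
submissions = [
    ("Member A", "Line 1", 1200.0),
    ("Member B", "Line 1", 1200.0),
    ("Member A", "Line 2", 950.0),
    ("Member B", "Line 2", 1010.0),
]
print(inconsistent_assets(submissions))
```

Agreement between members shows only that the data are consistent, not that they are accurate or current, so a cross-check like this would complement rather than replace validation against actual asset capabilities.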

E. Task Force Recommendation for Corrective 
Actions to be Completed by Other Parties by 
June 30, 2004 

Task Force: 

Recommends that NERC require each North
American reliability coordinator, reliability
council, control area, and transmission
company not directly addressed above to review
the actions required above and determine
whether it has adequate system facilities,
operational procedures, tools, and training to
ensure reliable operations for the summer of
2004. If any entity finds that improvements
are needed, it should immediately undertake
the needed improvements, and coordinate
them with its neighbors and partners as
necessary.

The Task Force also recommends that FERC 
and government agencies in Canada require 
all entities under their jurisdiction who are 
users of GE/Harris XA21 Energy Management 
Systems to consult the vendor and ensure that 
appropriate actions have been taken to avert
any recurrence of the malfunction that
occurred on FE’s system on August 14.


16. Establish enforceable standards for
maintenance of electrical clearances in
right-of-way areas. 22

On February 10, the NERC Board directed the
NERC Compliance Program and the regional
councils to initiate a joint program for reporting all
bulk electric system transmission line trips
resulting from vegetation contact. Based on the results of
these filings, NERC is to consider the development
of minimum line clearance standards to ensure
reliability.

The Task Force believes that more aggressive
action is warranted. NERC should work with
FERC, appropriate authorities in Canada, state
regulatory agencies, the Institute of Electrical and
Electronics Engineers (IEEE), utility arborists, and
other experts from the U.S. and Canada to develop
clear, unambiguous standards pertaining to
maintenance of safe clearances of transmission lines
from obstructions in the lines’ right-of-way areas,
and to develop a mechanism to verify compliance
with the standards and impose penalties for
non-compliance.


Ineffective vegetation management was a major 
cause of the August 14, 2003, blackout and it was 
also a causal factor in other large-scale North 
American outages such as those that occurred in 
the summer of 1996 in the western United States. 
Maintaining transmission line rights-of-way, 
including maintaining safe clearances of ener¬ 
gized lines from vegetation, man-made structures, 
bird nests, etc., requires substantial expenditures 
in many areas of North America. However, such 
maintenance is a critical investment for ensuring a 
reliable electric system. For a review of current 
issues pertaining to utility vegetation management
programs, see Utility Vegetation Management
Final Report, March 2004. 23

NERC does not presently have standards for 
right-of-way maintenance. However, it has standards
requiring that line ratings be set to maintain
safe clearances from all obstructions. Line rating 
standards should be reviewed to ensure that they 
are sufficiently clear and explicit. In the United 
States, National Electrical Safety Code (NESC) 
rules specify safety clearances required for overhead
conductors from grounded objects and other
types of obstructions, but those rules are subject to 
broad interpretation. Several states have adopted 
their own electrical safety codes and similar codes 
apply in Canada and its provinces. A mechanism 
is needed to verify compliance with these requirements
and to penalize noncompliance.


A. Enforceable Standards 

NERC should work with FERC, government agencies
in Canada, state regulatory agencies, the Institute
of Electrical and Electronics Engineers (IEEE),
utility arborists, and other experts from the U.S.
and Canada to develop clear, unambiguous standards
pertaining to maintenance of safe clearances
of transmission lines from obstructions in the 
lines’ right-of-way areas, and procedures to verify 
compliance with the standards. States, provinces, 
and local governments should remain free to set 
more specific or higher standards as they deem 
necessary for their respective areas. 

B. Right-of-Way Management Plan 

NERC should require each bulk electric transmission
operator to publish annually a proposed
right-of-way management plan on its public
website, and a report on its right-of-way management
activities for the previous year. The management
plan should include the planned frequency
of actions such as right-of-way trimming, herbicide
treatment, and inspections, and the report
should give the dates when the rights-of-way in a 
given district were last inspected and corrective 
actions taken. 

C. Requirement to Report Outages Due to 
Ground Faults in Right-of-Way Areas 

Beginning with an effective date of March 31, 
2004, NERC should require each transmission 
owner/operator to submit quarterly reports of all 
ground-fault line trips, including their causes, on 
lines of 115 kV and higher in its footprint to the 
regional councils. Failure to report such trips 
should lead to an appropriate penalty. Each 
regional council should assemble a detailed 
annual report on ground fault line trips and their 
causes in its area to FERC, NERC, DOE, appropriate
authorities in Canada, and state regulators no
later than March 31 for the preceding year, with 
the first annual report to be filed in March 2005 for 
calendar year 2004. 

D. Transmission-Related Vegetation Management
Expenses, if Prudently Incurred, Should be
Recoverable through Electric Rates

The level of activity in vegetation management 
programs in many utilities and states has fluctuated
widely from year to year, due in part to inconsistent
funding and varying management support.
Utility managers and regulators should recognize
the importance of effective vegetation management
to transmission system reliability, and that
changes in vegetation management may be needed
in response to weather, insect infestations, and
other factors. Transmission vegetation management
programs should be consistently funded and
proactively managed to maintain and improve 
system reliability. 


17. Strengthen the NERC Compliance 
Enforcement Program. 

On February 10, 2004, the NERC Board of Trustees 
approved directives to the regional reliability 
councils that will significantly strengthen NERC’s 
existing Compliance Enforcement Program. The 
Task Force supports these directives strongly, and 
recommends certain additional actions, as 
described below. 24 

A. Reporting of Violations

NERC:

Requires each regional council to report to the 
NERC Compliance Enforcement Program 
within one month of occurrence all “significant 
violations” of NERC operating policies and 
planning standards and regional standards, 
whether verified or still under investigation by 
the regional council. (A “significant violation” 
is one that could directly reduce the integrity of 
the interconnected power systems or otherwise 
cause unfavorable risk to the interconnected 
power systems.) In addition, each regional 
council is to report quarterly to NERC, in a format
prescribed by NERC, all violations of
NERC and regional reliability standards. 

Task Force: 

Recommends that NERC require the regional 
councils’ quarterly reports and reports on significant
violations be filed as public documents
with FERC and appropriate authorities
in Canada, at the same time that they are sent 
to NERC. 

B. Enforcement Action by NERC Board

NERC:

After being presented with the results of the investigation
of a significant violation, the Board
is to require an offending organization to correct
the violation within a specified time. If the
Board determines that the organization is
non-responsive and continues to cause a risk to
the reliability of the interconnected power systems,
the Board will seek to remedy the violation
by requesting assistance from appropriate
regulatory authorities in the United States and
Canada.

Task Force: 

Recommends that NERC inform the federal 
and state or provincial authorities of both 
countries of the final results of all enforcement
proceedings, and make the results of
such proceedings public. 

C. Violations in August 14, 2003 Blackout 

NERC: 

The Compliance and Standards investigation 
team will issue a final report in March or April 
of 2004 of violations of NERC and regional 
standards that occurred on August 14. (Seven 
violations are noted in this report (pages 19-20),
but additional violations may be identified
by NERC.) Within three months of the issuance
of the report, NERC is to develop recommendations
to improve the compliance process.

Task Force: 

Recommends that NERC make its recommendations
available to appropriate U.S. federal
and state authorities, to appropriate authorities
in Canada, and to the public.

D. Compliance Audits

NERC:

Established plans for two types of audits, compliance
audits and readiness audits. Compliance
audits would determine whether the subject
entity is in documented compliance with
NERC standards, policies, etc. Readiness audits
focus on whether the entity is functionally
capable of meeting the terms of its reliability responsibilities.
Under the terms approved by
NERC’s Board, the readiness audits to be completed
by June 30, 2004, will be conducted using
existing NERC rules, policies, standards,
and NERC compliance templates. Requirements
for control areas will be based on the existing
NERC Control Area Certification Procedure,
and updated as new criteria are approved.

Task Force: 

Supports the NERC effort to verify that all
entities are compliant with reliability standards.
Effective compliance and auditing will
require that the NERC standards be improved
rapidly to make them clear, unambiguous,
measurable, and consistent with the
Functional Model.

E. Audit Standards and Composition of Audit
Teams

NERC: 

Under the terms approved by the Board, the regional
councils are to have primary responsibility
for conducting the compliance audits,
under the oversight and direct participation of
staff from the NERC Compliance Enforcement
Program. FERC and other relevant regulatory
agencies will be invited to participate in the audits,
subject to the same confidentiality conditions
as the other team members.

Task Force: 

Recommends that each team should have
some members who are electric reliability experts
from outside the region in which the audit
is occurring. Also, some team members
should be from outside the electricity industry,
i.e., individuals with experience in systems
engineering and management, such as
persons from the nuclear power industry, the
U.S. Navy, the aerospace industry, air traffic
control, or other relevant industries or government
agencies. To improve the objectivity
and consistency of investigation and performance,
NERC-organized teams should conduct
these compliance audits, using NERC criteria
(with regional variations if more stringent),
as opposed to the regional councils using
regionally developed criteria.

F. Public Release of Compliance Audit Reports 

Task Force: 

Recommends that NERC require all compliance
audit reports to be publicly posted, excluding
portions pertaining to physical and
cyber security according to predetermined
criteria. Such reports should draw clear distinctions
between serious and minor violations
of reliability standards or related requirements.


18. Support and strengthen NERC’s Reliability
Readiness Audit Program. 25

On February 10, 2004, the NERC Board of Trustees 
approved the establishment of a NERC program for 
periodic reviews of the reliability readiness of all 
reliability coordinators and control areas. The 
Task Force strongly supports this action, and recommends
certain additional measures, as
described below. 


A. Readiness Audits

NERC:

In its directives of February 10, 2004, NERC indicated
that it and the regional councils would
jointly establish a program to audit the reliability
readiness of all reliability coordinators and
control areas within three years and continuing
thereafter on a three-year cycle. Twenty audits
of high-priority areas will be completed by June
30, 2004, with particular attention to deficiencies
identified in the investigation of the August
14 blackout.

Task Force: 

Recommends that the remainder of the first 
round of audits be completed within two 
years, as compared to NERC’s plan for three 
years. 

B. Public Release of Readiness Audit Reports

Task Force:

Recommends that NERC require all readiness
audit reports to be publicly posted, excluding
portions pertaining to physical and cyber security.
Reports should also be sent directly to
DOE, FERC, and relevant authorities in Canada
and state commissions. Such reports
should draw clear distinctions between serious
and minor violations of reliability standards
or related requirements.


19. Improve near-term and long-term 
training and certification requirements 
for operators, reliability coordinators, 
and operator support staff. 26 

In its requirements of February 10, 2004, NERC 
directed that all reliability coordinators, control 
areas, and transmission operators are to provide at 
least five days per year of training and drills in 
system emergencies, using realistic simulations, for 
each staff person with responsibility for the 
real-time operation or reliability monitoring of the 
bulk electric system. This system emergency training
is in addition to other training requirements.
Five days of system emergency training and drills 
are to be completed by June 30, 2004. 

The Task Force supports these near-term requirements
strongly. For the long term, the Task Force
recommends that: 

A. NERC should require training for the planning
staff at control areas and reliability coordinators
concerning power system characteristics
and load, VAr, and voltage limits, to enable
them to develop rules for operating staff to follow.

B. NERC should require control areas and reliability
coordinators to train grid operators, IT support
personnel, and their supervisors to
recognize and respond to abnormal automation 
system activity. 

C. NERC should commission an advisory report by 
an independent panel to address a wide range 
of issues concerning reliability training programs
and certification requirements.


The Task Force investigation team found that 
some reliability coordinators and control area 
operators had not received adequate training in 
recognizing and responding to system emergencies.
Most notable was the lack of realistic simulations
and drills to train and verify the capabilities
of operating personnel. Such simulations are
essential if operators and other staff are to be able
to respond adequately to emergencies. This training
deficiency contributed to the lack of situational
awareness and failure to declare an
emergency on August 14 while operator intervention
was still possible (before events began to
occur at a speed beyond human control). 

Control rooms must remain functional under a 
wide range of possible conditions. Any person 
with access to a control room should be trained so 
that he or she understands the basic functions of 
the control room, and his or her role in relation to 
those of others in the room under any conditions. 
Information technology (IT) staff, in particular, 
should have a detailed understanding of the information
needs of the system operators under alternative
conditions.

The Task Force’s cyber investigation team noted 
in its site visits an increasing reliance by control 
areas and utilities on automated systems to measure,
report on, and change a wide variety of physical
processes associated with utility operations. 27
If anything, this trend is likely to intensify in the
future. These systems enable the achievement of
major operational efficiencies, but their failure
could cause or contribute to blackouts, as evidenced
by the alarm failures at FirstEnergy and
the state estimator deactivation at MISO. 

Grid operators should be trained to recognize and 
respond more efficiently to security and automation
problems, reinforced through the use of periodic
exercises. Likewise, IT support personnel
should be better trained to understand and
respond to the requirements of grid operators during
security and IT incidents.


NERC’s near-term requirements for emergency 
preparedness training are described above. For the 
long term, training for system emergencies should 
be fully integrated into the broader training programs
required for all system planners, system
operators, their supervisors, and other control 
room support staff. 

Advisory Report by Independent Panel on 
Industry Training Programs and Certification 
Requirements 

Under the oversight of FERC and appropriate 
Canadian authorities, the Task Force recommends 
that NERC commission an independent advisory 
panel of experts to design and propose minimum 
training programs and certification procedures for 
the industry’s control room managers and staff. 
This panel should be comprised of experts from 
electric industry organizations with outstanding 
training programs, universities, and other industries
that operate large safety or reliability-oriented
systems and training programs. (The
Institute of Nuclear Power Operations (INPO), for
example, provides training and other safety-related
services to operators of U.S. nuclear power
plants and plants in other countries.) The panel’s
report should provide guidance on issues such as: 

1. Content of programs for new trainees 

2. Content of programs for existing operators and 
other categories of employees 

3. Content of continuing education programs and 
fraction of employee time to be committed to 
ongoing training 

4. Going beyond paper-based, fact-oriented 
“knowledge” requirements for operators—i.e., 
confirming that an individual has the ability to 
cope with unforeseen situations and 
emergencies 

5. In-house training vs. training by independent 
parties 

6. Periodic accreditation of training programs 

7. Who should certify trained staff? 

8. Criteria to establish grades or levels of operator 
qualifications from entry level to supervisor or 
manager, based on education, training, and 
experience. 

The panel’s report should be delivered by March 
31, 2005. FERC and Canadian authorities, in consultation
with NERC and others, should evaluate
the report and consider its findings in setting
minimum training and certification requirements
for control areas and reliability coordinators. 


20. Establish clear definitions for normal,
alert and emergency operational system
conditions. Clarify roles, responsibilities,
and authorities of reliability
coordinators and control areas under
each condition. 28

NERC should develop by June 30, 2004 definitions 
for normal, alert, and emergency system conditions,
and clarify reliability coordinator and control
area functions, responsibilities, required
capabilities, and required authorities under each
operational system condition.

System operators need common definitions for 
normal, alert, and emergency conditions to enable 
them to act appropriately and predictably as system
conditions change. On August 14, the principal
entities involved in the blackout did not have a
shared understanding of whether the grid was in
an emergency condition, nor did they have a common
understanding of the functions, responsibilities,
capabilities, and authorities of reliability
coordinators and control areas under emergency 
or near-emergency conditions. 

NERC: 

On February 10, 2004, NERC’s Board of 
Trustees directed NERC’s Operating Committee
to “clarify reliability coordinator and control
area functions, responsibilities, capabilities,
and authorities” by June 30, 2004.

Task Force: 

Recommends that NERC go further and develop
clear definitions of three operating system
conditions, along with clear statements of
the roles and responsibilities of all participants,
to ensure effective and timely actions in
critical situations.

Designating three alternative system conditions 
(normal, alert, and emergency) would help grid 
managers to avert and deal with emergencies 
through preventive action. Many difficult situations
are avoidable through strict adherence to
sound procedures during normal operations.
However, unanticipated difficulties short of an
emergency still arise, and they must be addressed
swiftly and skillfully to prevent them from becoming
emergencies. Doing so requires a high level of
situational awareness that is difficult to sustain
indefinitely, so an intermediate “alert” state is
needed, between “normal” and “emergency.” In
some areas (e.g., NPCC) an “alert” state has already
been established.


21. Make more effective and wider use of 
system protection measures. 29 

In its requirements of February 10, 2004, NERC: 

A. Directed all transmission owners to evaluate 
the settings of zone 3 relays on all transmission 
lines of 230 kV and higher. 

B. Directed all regional councils to evaluate the 
feasibility and benefits of installing 
under-voltage load shedding capability in load 
centers. 

C. Called for an evaluation within one year of its 
planning standard on system protection and 
control to take into account the lessons from the 
August 14 blackout. 

The Task Force supports these actions strongly,
and recommends certain additional measures, as
described below.

A. Evaluation of Zone 3 Relays 

NERC: 

Industry is to review zone 3 relays on lines of 
230 kV and higher. 

Task Force: 

Recommends that NERC broaden the review 
to include operationally significant 115 kV 
and 138 kV lines, e.g., lines that are part of 
monitored flowgates or interfaces. Transmission
owners should also look for zone 2 relays
set to operate like zone 3s. 

B. Evaluation of Applicability of Under-Voltage 
Load Shedding 

NERC: 

Required each regional reliability council to
evaluate the feasibility and benefits of under-voltage
load shedding (UVLS) capability in
load centers that could become unstable as a result
of insufficient reactive power following
credible multiple-contingency events. The regions
should complete the initial studies and
report the results to NERC within one year. The
regions should promote the installation of under-voltage
load shedding capabilities within
critical areas where beneficial, as determined 
by the studies to be effective in preventing or 
containing an uncontrolled cascade of the 
power system.

Task Force:

Recommends that NERC require the results of
the regional studies to be provided to federal
and state or provincial regulators at the same
time that they are reported to NERC. In addition,
NERC should require every entity with a
new or existing UVLS program to have a
well-documented set of guidelines for operators
that specify the conditions and triggers for
UVLS use.

C. Evaluation of NERC’s Planning Standard III

NERC:

Plans to evaluate Planning Standard III, System
Protection and Control, and propose, by March
1, 2005, specific revisions to the criteria to address
adequately the issue of slowing or limiting
the propagation of a cascading failure, in
light of the experience gained on August 14.

Task Force:

Recommends that NERC, as part of the review
of Planning Standard III, determine the goals
and principles needed to establish an integrated
approach to relay protection for generators
and transmission lines and the use of under-frequency
and under-voltage load shedding
(UFLS and UVLS) programs. An integrated
approach is needed to ensure that at the
local and regional level these interactive components
provide an appropriate balance of
risks and benefits in terms of protecting specific
assets and facilitating overall grid survival.
This review should take into account
the evidence from August 14 of some unintended
consequences of installing Zone 3 relays
and using manufacturer-recommended
settings for relays protecting generators. It
should also include an assessment of the appropriate
role and scope of UFLS and UVLS,
and the appropriate use of time delays in relays.

Recommends that in this effort NERC should
work with industry and government research
organizations to assess the applicability of existing
and new technology to make the interconnections
less susceptible to cascading outages.


22. Evaluate and adopt better real-time
tools for operators and reliability coordinators. 30

NERC’s requirements of February 10, 2004, direct
its Operating Committee to evaluate within one
year the real-time operating tools necessary for
reliability operation and reliability coordination,
including backup capabilities. The committee’s
report is to address both minimum acceptable
capabilities for critical reliability functions and a
guide to best practices.

The Task Force supports these requirements
strongly. It recommends that NERC require the
committee to:

A. Give particular attention in its report to the
development of guidance to control areas and
reliability coordinators on the use of automated
wide-area situation visualization display systems
and the integrity of data used in those systems.

B. Prepare its report in consultation with FERC,
appropriate authorities in Canada, DOE, and
the regional councils. The report should also
inform actions by FERC and Canadian
government agencies to establish minimum
functional requirements for control area operators
and reliability coordinators.

The Task Force also recommends that FERC, DHS,
and appropriate authorities in Canada should
require annual independent testing and certification
of industry EMS and SCADA systems to ensure
that they meet the minimum requirements envisioned
in Recommendation 3.


A principal cause of the August 14 blackout was a
lack of situational awareness, which was in turn
the result of inadequate reliability tools and
backup capabilities. In addition, the failure of FE’s
control computers and alarm system contributed
directly to the lack of situational awareness. Likewise,
MISO’s incomplete tool set and the failure to
supply its state estimator with correct system data
on August 14 contributed to the lack of situational
awareness. The need for improved visualization
capabilities over a wide geographic area has been a
recurrent theme in blackout investigations. Some
wide-area tools to aid situational awareness (e.g.,
real-time phasor measurement systems) have been
tested in some regions but are not yet in general
use. Improvements in this area will require significant
new investments involving existing or emerging
technologies.

The investigation of the August 14 blackout
revealed that there has been no consistent means
across the Eastern Interconnection to provide an
understanding of the status of the power grid outside
of a control area. Improved visibility of the
status of the grid beyond an operator’s own area of
control would aid the operator in making adjustments
in its operations to mitigate potential
problems. The expanded view advocated above
would also enable facilities to be more proactive in
operations and contingency planning.

Annual testing and certification by independent,
qualified parties is needed because EMS and
SCADA systems are the nerve centers of bulk electric
networks. Ensuring that these systems are
functioning properly is critical to sound and reliable
operation of the networks.


23. Strengthen reactive power and voltage 
control practices in all NERC regions. 31 

NERC’s requirements of February 10, 2004 call for 
a reevaluation within one year of existing reactive 
power and voltage control standards and how they 
are being implemented in the ten NERC regions. 
However, by June 30, 2004, ECAR is required to 
review its reactive power and voltage criteria and 
procedures, verify that its criteria and procedures 
are being fully implemented in regional and member
studies and operations, and report the results
to the NERC Board. 

The Task Force supports these requirements 
strongly. It recommends that NERC require the 
regional analyses to include recommendations for 
appropriate improvements in operations or facilities,
and to be subject to rigorous peer review by
experts from within and outside the affected areas. 

The Task Force also recommends that FERC and 
appropriate authorities in Canada require all tariffs
or contracts for the sale of generation to
include provisions specifying that the generators 
can be called upon to provide or increase reactive 
power output if needed for reliability purposes, 
and that the generators will be paid for any lost 
revenues associated with a reduction of real power 
sales attributable to a required increase in the production
of reactive power.
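
The lost-revenue provision above can be illustrated with a rough sketch. It assumes, as a simplification that does not come from this report, that a generator's output is limited only by its apparent power rating (S^2 = P^2 + Q^2); actual generator capability curves impose further limits. The function names, the 500 MVA rating, the price, and the duration are all hypothetical.

```python
import math

def max_real_power_mw(rating_mva: float, reactive_mvar: float) -> float:
    """Largest real power output if the only limit is S^2 = P^2 + Q^2 (assumed)."""
    if abs(reactive_mvar) > rating_mva:
        raise ValueError("reactive output exceeds the MVA rating")
    return math.sqrt(rating_mva ** 2 - reactive_mvar ** 2)

def lost_revenue(rating_mva, scheduled_mw, required_mvar, price_per_mwh, hours):
    """Return (curtailed MW, lost revenue) when reactive output must rise."""
    available_mw = max_real_power_mw(rating_mva, required_mvar)
    curtailed_mw = max(0.0, scheduled_mw - available_mw)
    return curtailed_mw, curtailed_mw * price_per_mwh * hours

# Hypothetical unit: 500 MVA, scheduled to sell 480 MW, asked for 300 MVAr
# for two hours at an assumed energy price of 40 per MWh.
curtailed, revenue = lost_revenue(500.0, 480.0, 300.0, 40.0, 2.0)
print(f"curtailed {curtailed:.0f} MW, lost revenue {revenue:.0f}")
```

Under these assumed figures the unit must back down from 480 MW to 400 MW to supply 300 MVAr, and the forgone sales are the amount the recommended tariff provision would reimburse.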

Reactive power problems were a significant factor 
in the August 14 outage, and they were also important
elements in several of the earlier outages
detailed in Chapter 7. 32 Accordingly, the Task
Force agrees that a comprehensive review is
needed of North American practices with respect
to managing reactive power requirements and
maintaining an appropriate balance among alternative
types of reactive resources.

Regional Analyses, Peer Reviews, and Follow-Up Actions

The Task Force recommends that each regional
reliability council, working with reliability coordinators
and the control areas serving major load
centers, should conduct a rigorous reliability and
adequacy analysis comparable to that outlined in
FERC’s December 24, 2003, Order 33 to FirstEnergy
concerning the Cleveland-Akron area. The Task
Force recommends that NERC develop a prioritized
list for which areas and loads need this type
of analysis and a schedule that ensures that the
analysis will be completed for all such load centers
by December 31, 2005.


24. Improve quality of system modeling 
data and data exchange practices. 34 

NERC’s requirements of February 10, 2004 direct
that within one year the regional councils are to
establish and begin implementing criteria and procedures
for validating data used in power flow
models and dynamic simulations by benchmarking
model data with actual system performance. Validated
modeling data shall be exchanged on an
inter-regional basis as needed for reliable system
planning and operation.

The Task Force supports these requirements 
strongly. The Task Force also recommends that 
FERC and appropriate authorities in Canada 
require all generators, regardless of ownership, to 
collect and submit generator data to NERC, using a 
regulator-approved template. 

The after-the-fact models developed to simulate 
August 14 conditions and events found that the 
dynamic modeling assumptions for generator and 
load power factors in regional planning and operating
models were frequently inaccurate. In particular,
the assumptions of load power factor were
overly optimistic: loads were absorbing much
more reactive power than the pre-August 14 models
indicated. Another suspected problem concerns
modeling of shunt capacitors under
depressed voltage conditions. 
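
To show why an optimistic load power factor matters, the reactive demand of a lagging load follows from Q = P * tan(arccos(pf)). The sketch below applies that identity. The 1,000 MW load and both power factor values are hypothetical illustrations, not figures from the investigation.

```python
import math

def reactive_demand_mvar(real_power_mw: float, power_factor: float) -> float:
    """Reactive power (MVAr) drawn by a lagging load: Q = P * tan(arccos(pf))."""
    if not 0.0 < power_factor <= 1.0:
        raise ValueError("power factor must be in (0, 1]")
    return real_power_mw * math.tan(math.acos(power_factor))

# Hypothetical values for illustration only.
load_mw = 1000.0
modeled_pf = 0.98  # assumed planning-model power factor
actual_pf = 0.92   # assumed power factor observed in operation

modeled_q = reactive_demand_mvar(load_mw, modeled_pf)
actual_q = reactive_demand_mvar(load_mw, actual_pf)
shortfall = actual_q - modeled_q
print(f"modeled {modeled_q:.0f} MVAr, actual {actual_q:.0f} MVAr, "
      f"unmodeled demand {shortfall:.0f} MVAr")
```

With these assumed inputs, a seemingly small change in power factor roughly doubles the reactive power the load absorbs, which is the kind of gap a planning model would miss.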

NERC should work with the regional reliability 
councils to establish regional power system models
that enable the sharing of consistent and validated
data among entities in the region. Power
flow and transient stability simulations should be
periodically benchmarked with actual system
events to validate model data. Viable load (including
load power factor) and generator testing programs
are necessary to improve agreement
between power flows and dynamic simulations 
and the actual system performance. 
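
One simple form of the benchmarking described above is to compare simulated quantities with recorded values from an actual event and flag those that differ by more than a tolerance. The sketch below is illustrative only. The record layout, the labels, the sample values, and the 5 percent tolerance are assumptions, not a NERC procedure.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    label: str        # hypothetical bus, line, or load identifier
    simulated: float  # value produced by the model
    measured: float   # value recorded during the actual event

def benchmark(samples, tolerance_pct=5.0):
    """Return (label, percent error) for samples outside the tolerance."""
    flagged = []
    for s in samples:
        if s.measured == 0:
            continue  # percent error is undefined; skipped in this sketch
        error_pct = 100.0 * abs(s.simulated - s.measured) / abs(s.measured)
        if error_pct > tolerance_pct:
            flagged.append((s.label, round(error_pct, 1)))
    return flagged

# Hypothetical comparison data.
samples = [
    Sample("bus_A_voltage_kv", 345.0, 338.0),
    Sample("line_B_flow_mw", 900.0, 1010.0),
    Sample("load_C_mvar", 200.0, 420.0),
]
flagged = benchmark(samples)
for label, err in flagged:
    print(f"{label}: model differs from measurement by {err}%")
```

Flagged quantities point to the model data, such as load power factor or generator parameters, that a testing program would then need to correct.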

During the data collection phase of the blackout 
investigation, when control areas were asked for 
information pertaining to merchant generation 
within their area, the requested data was 
frequently not available because the control area
had not recorded the status or output of the generator
at a given point in time. Some control area
operators also asserted that some of the data that
did exist was commercially sensitive or confidential.
To correct such problems, the Task Force recommends
that FERC and authorities in Canada
require all generators, regardless of ownership, to 
collect and submit generator data, according to a 
regulator-approved template. 


25. NERC should reevaluate its existing
reliability standards development process
and accelerate the adoption of
enforceable standards. 35

The Task Force recommends that, with support 
from FERC and appropriate authorities in Canada, 
NERC should: 

A. Re-examine its existing body of standards, 
guidelines, etc., to identify those that are most 
important and ensure that all concerns that 
merit standards are addressed in the plan for 
standards development. 

B. Re-examine the plan to ensure that those that 
are the most important or the most out-of-date 
are addressed early in the process. 

C. Build on existing provisions and focus on what
needs improvement, and incorporate compliance
and readiness considerations into the
drafting process.

D. Re-examine the Standards Authorization
Request process to determine whether, for each
standard, a review and modification of an existing
standard would be more efficient than
development of wholly new text for the standard.

NERC has already begun a long-term, systematic 
process to reevaluate its standards. It is of the 
greatest importance, however, that this process 
not dilute the content of the existing standards, 
nor conflict with the right of regions or other areas 
to impose more stringent standards. The state of 
New York, for example, operates under mandatory 
and more stringent reliability rules and standards 
than those required by NERC and NPCC. 36 

Similarly, several commenters on the Interim 
Report wrote jointly that: 

NERC standards are the minimum—national 
standards should always be minimum rather 
than absolute or “one size fits all” criteria. 
[Systems for] densely populated areas, like the 
metropolitan areas of New York, Chicago, or 
Washington, must be designed and operated in 
accordance with a higher level of reliability than 
would be appropriate for sparsely populated 
parts of the country. It is essential that regional 
differences in terms of load and population 
density be recognized in the application of planning 
and operating criteria. Any move to adopt a 
national, “one size fits all” formula for all parts 
of the United States would be disastrous to 
reliability .... 

A strong transmission system designed and oper¬ 
ated in accordance with weakened criteria 
would be disastrous. Instead, a concerted effort 
should be undertaken to determine if existing 
reliability criteria should be strengthened. Such 
an effort would recognize the geo-electrical mag¬ 
nitude of today’s interconnected networks, and 
the increased complexities deregulation and 
restructuring have introduced in planning and 
operating North American power systems. Most 
important, reliability should be considered a 
higher priority than commercial use. Only 
through strong standards and careful engineer¬ 
ing can unacceptable power failures like the 
August 14 blackout be avoided in the future. 37 


26. Tighten communications protocols, 
especially for communications during 
alerts and emergencies. Upgrade 
communication system hardware where 
appropriate. 38 

NERC should work with reliability coordinators 
and control area operators to improve the effective¬ 
ness of internal and external communications dur¬ 
ing alerts, emergencies, or other critical situations, 
and ensure that all key parties, including state and 
local officials, receive timely and accurate infor¬ 
mation. NERC should task the regional councils to 
work together to develop communications proto¬ 
cols by December 31, 2004, and to assess and 
report on the adequacy of emergency communica¬ 
tions systems within their regions against the pro¬ 
tocols by that date. 

On August 14, 2003, reliability coordinator and 
control area communications regarding conditions 
in northeastern Ohio were in some cases 
ineffective, unprofessional, and confusing. 
Ineffective communications contributed to a lack of 
situational awareness and precluded effective 
actions to prevent the cascade. Consistent 
application of effective communications protocols, 
particularly during alerts and emergencies, is 
essential to reliability. Standing hotline networks, 
or a functional equivalent, should be established 
for use in alerts and emergencies (as opposed to 
one-on-one phone calls) to ensure that all key 
parties are able to give and receive timely and 
accurate information. 


27. Develop enforceable standards for 
transmission line ratings. 39 

NERC should develop clear, unambiguous 
requirements for the calculation of transmission line 
ratings (including dynamic ratings), and require 
that all lines of 115 kV or higher be rerated 
according to these requirements by June 30, 2005. 

As seen on August 14, inadequate vegetation 
management can lead to the loss of transmission lines 
that are not overloaded, at least not according to 
their rated limits. The investigation of the 
blackout, however, also found that even after allowing 
for regional or geographic differences, there is still 
significant variation in how the ratings of existing 
lines have been calculated. This variation (in 
terms of assumed ambient temperatures, wind 
speeds, conductor strength, and the purposes and 
duration of normal, seasonal, and emergency 
ratings) makes the ratings themselves unclear, 
inconsistent, and unreliable across a region or 
between regions. This situation creates 
unnecessary and unacceptable uncertainties about the safe 
carrying capacity of individual lines on the 
transmission networks. Further, the appropriate use of 
dynamic line ratings needs to be included in this 
review because adjusting a line’s rating according 
to changes in ambient conditions may enable the 
line to carry a larger load while still meeting safety 
requirements. 


28. Require use of time-synchronized data 
recorders. 40 

In its requirements of February 10, 2004, NERC 
directed the regional councils to define within one 
year regional criteria for the application of 
synchronized recording devices in key power plants 
and substations. 

The Task Force supports the intent of this 
requirement strongly, but it recommends a broader 
approach: 

A. FERC and appropriate authorities in Canada 
should require the use of data recorders 
synchronized by signals from the Global 
Positioning System (GPS) on all categories of 
facilities whose data may be needed to 
investigate future system disturbances, outages, 
or blackouts. 

B. NERC, reliability coordinators, control areas, 
and transmission owners should determine 
where high speed power system disturbance 
recorders are needed on the system, and ensure 
that they are installed by December 31, 2004. 

C. NERC should establish data recording 
protocols. 

D. FERC and appropriate authorities in Canada 
should ensure that the investments called for in 
this recommendation will be recoverable 
through transmission rates. 

A valuable lesson from the August 14 blackout is 
the importance of having time-synchronized sys¬ 
tem data recorders. The Task Force’s investigators 
labored over thousands of data items to determine 
the sequence of events, much like putting together 
small pieces of a very large puzzle. That process 
would have been significantly faster and easier if 
there had been wider use of synchronized data 
recording devices. 

NERC Planning Standard I.F, Disturbance Moni¬ 
toring, requires the use of recording devices for 
disturbance analysis. On August 14, time record¬ 
ers were frequently used but not synchronized to a 
time standard. Today, at a relatively modest cost, 
all digital fault recorders, digital event recorders, 
and power system disturbance recorders can and 
should be time-stamped at the point of observa¬ 
tion using a Global Positioning System (GPS) 
synchronizing signal. (The GPS signals are syn¬ 
chronized with the atomic clock maintained in 
Boulder, Colorado by the U.S. National Institute of 
Standards and Technology.) Recording and time- 
synchronization equipment should be monitored 
and calibrated to assure accuracy and reliability. 
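
As an illustration of why a common time base matters, the following minimal Python sketch merges event records from several recorders into one sequence of events. It assumes that each recorder's clock offset from the reference time is known (zero for a GPS-synchronized device). The function name merge_events, the record layout, and the sample data are hypothetical and are not drawn from the investigation.

```python
from datetime import datetime, timedelta


def merge_events(recorders):
    """Merge per-recorder event lists into one time-ordered sequence.

    recorders maps a recorder name to (clock_offset_seconds, events),
    where clock_offset_seconds is how far the local clock runs ahead of
    the reference time (0 for a GPS-synchronized recorder) and events is
    a list of (local_timestamp, description) tuples.
    """
    merged = []
    for name, (offset_s, events) in recorders.items():
        for local_ts, description in events:
            # Convert the local time stamp to the common reference time.
            reference_ts = local_ts - timedelta(seconds=offset_s)
            merged.append((reference_ts, name, description))
    merged.sort(key=lambda event: event[0])
    return merged


# Hypothetical sample data (illustrative only).
sample = {
    "substation_A": (0.0, [(datetime(2004, 1, 1, 12, 0, 0), "line trip")]),
    "plant_B": (3.0, [(datetime(2004, 1, 1, 12, 0, 2), "relay operation")]),
}

for ts, name, description in merge_events(sample):
    print(ts.isoformat(), name, description)
```

In this made-up case the uncorrected stamps would place the line trip first, while the corrected sequence places the relay operation one second earlier. When offsets are unknown, as with unsynchronized recorders, no such correction is possible and the order must be inferred from other evidence.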

It is also important that data from automation sys¬ 
tems be retained at least for some minimum 
period, so that if necessary it can be archived to 
enable adequate analysis of events of particular 
interest. 


29. Evaluate and disseminate lessons 
learned during system restoration. 41 

In the requirements it issued on February 10, 2004, 
NERC directed its Planning Committee to work 
with the Operating Committee, NPCC, ECAR, and 
PJM to evaluate the black start and system 
restoration performance following the outage of August 
14, and to report within one year the results of that 
evaluation, with recommendations for 
improvement. Within six months of the Planning 
Committee’s report, all regional councils are to 
have reevaluated their plans and procedures to 
ensure an effective black start and restoration 
capability within their region. 

The Task Force supports these requirements 
strongly. In addition, the Task Force recommends 
that NERC should require the Planning Commit¬ 
tee’s review to include consultation with appropri¬ 
ate stakeholder organizations in all areas that were 
blacked out on August 14. 

The efforts to restore the power system and cus¬ 
tomer service following the outage were generally 
effective, considering the massive amount of load 
lost and the large number of generators and trans¬ 
mission lines that tripped. Fortunately, the resto¬ 
ration was aided by the ability to energize 
transmission from neighboring systems, thereby 
speeding the recovery. 

Despite the apparent success of the restoration 
effort, it is important to evaluate the results in 
more detail to compare them with previous black¬ 
out/restoration studies and determine opportuni¬ 
ties for improvement. Black start and restoration 
plans are often developed through study of simu¬ 
lated conditions. Robust testing of live systems is 
difficult because of the risk of disturbing the sys¬ 
tem or interrupting customers. The August 14 
blackout provides a valuable opportunity to 
review actual events and experiences to learn how 
to better prepare for system black start and restora¬ 
tion in the future. That opportunity should not be 
lost. 

30. Clarify criteria for identification of 
operationally critical facilities, and 
improve dissemination of updated 
information on unplanned outages. 42 

NERC should work with the control areas and reli¬ 
ability coordinators to clarify the criteria for iden¬ 
tifying critical facilities whose operational status 
can affect the reliability of neighboring areas, and 
to improve mechanisms for sharing information 
about unplanned outages of such facilities in near 
real-time. 

The lack of accurate, near real-time information 
about unplanned outages degraded the 
performance of state estimator and reliability 
assessment functions on August 14. NERC and the 
industry must improve the mechanisms for 
sharing outage information in the operating time 
horizon (e.g., 15 minutes or less), to ensure the 
accurate and timely sharing of outage data needed 
by real-time operating tools such as state 
estimators, real-time contingency analyzers, and 
other system monitoring tools. 

Further, NERC’s present operating policies do not 
adequately specify criteria for identifying those 
critical facilities within reliability coordinator and 
control area footprints whose operating status 
could affect the reliability of neighboring systems. 
This leads to uncertainty about which facilities 
should be monitored both by the reliability 
coordinator for the region in which the facility is located 
and by one or more neighboring reliability 
coordinators. 


31. Clarify that the transmission loading 
relief (TLR) process should not be used 
in situations involving an actual 
violation of an Operating Security Limit. 
Streamline the TLR process. 43 

NERC should clarify that the TLR procedure is 
often too slow for use in situations in which an 
affected system is already in violation of an 
Operating Security Limit. NERC should also evaluate 
experience to date with the TLR procedure and 
propose by September 1, 2004, ways to make it less 
cumbersome. 

The reviews of control area and reliability 
coordinator transcripts from August 14 confirm that the 
TLR process is cumbersome, perhaps 
unnecessarily so, and not fast and predictable enough for use 
in situations in which an Operating Security Limit is 
close to or actually being violated. NERC should 
develop an alternative to TLRs that can be used 
quickly to address alert and emergency 
conditions. 

Group III. Physical and Cyber Security 
of North American Bulk Power Systems 

32. Implement NERC IT standards. 

The Task Force recommends that NERC standards 
related to physical and cyber security should be 
understood as being included within the body of 
standards to be made mandatory and enforceable 
in Recommendation No. 1. Further: 

A. NERC should ensure that the industry has 
implemented its Urgent Action Standard 1200; 
finalize, implement, and ensure membership 
compliance with its Reliability Standard 1300 
for Cyber Security and take actions to better 
communicate and enforce these standards. 

B. CAs and RCs should implement existing and 
emerging NERC standards, develop and 
implement best practices and policies for IT and 
security management, and authenticate and 
authorize controls that address EMS 
automation system ownership and boundaries. 

Interviews and analyses conducted by the SWG 
indicate that within some of the companies inter¬ 
viewed there are potential opportunities for cyber 
system compromise of EMS and their supporting 
IT infrastructure. Indications of procedural and 
technical IT management vulnerabilities were 
observed in some facilities, such as unnecessary 
software services not denied by default, loosely 
controlled system access and perimeter control, 
poor patch and configuration management, and 
poor system security documentation. 
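
The "denied by default" observation above can be made concrete with a small check. The following Python sketch is illustrative only: it assumes an operator maintains an approved list of services for an EMS host and compares that list with the services actually found running. The function name, the service names, and the data are hypothetical.

```python
def unapproved_services(running, approved):
    """Return services that are running but not on the approved list.

    Deny by default: anything not explicitly approved is flagged.
    """
    approved_set = set(approved)
    return sorted(s for s in set(running) if s not in approved_set)


# Hypothetical inventory for one host (illustrative only).
running = ["ems-core", "telnet", "ftp", "historian"]
approved = ["ems-core", "historian"]

print(unapproved_services(running, approved))  # ['ftp', 'telnet']
```

A check of this kind only reports exceptions. Deciding which services are necessary, and disabling the rest, remains a matter of policy and configuration management.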

An analysis of the more prevalent policies and 
standards within the electricity sector revealed 
that there is existing and expanding guidance on 
standards within the sector to perform IT and 
information security management. 44 NERC issued 
a temporary standard (Urgent Action Standard 
1200, Cyber Security) on August 13, 2003, and is 
developing the formal Reliability Standard 1300 
for Cyber Security. Both start the industry down 
the correct path, but there is a need to communi¬ 
cate and enforce these standards by providing the 
industry with recommended implementation 
guidance. Implementation guidance regarding 
these sector-wide standards is especially impor¬ 
tant given that implementation procedures may 
differ among CAs and RCs. 

In order to address the finding described above, 
the Task Force recommends: 

♦ NERC: 

> Ensure that the industry has implemented its 
Urgent Action Standard 1200 and determine 
if the guidance contained therein needs to be 
strengthened or amended in the ongoing 
development of its Reliability Standard 1300 
for Cyber Security. 

> Finalize, implement, and ensure membership 
compliance of its Reliability Standard 
1300 for Cyber Security and take actions to 
better communicate and enforce these 
standards. These actions should include, but not 
necessarily be limited to: 

1. The provision of policy, process, and 
implementation guidance to CAs and RCs; 
and 

2. The establishment of mechanisms for 
compliance, audit, and enforcement. This may 
include recommendations, guidance, or 
agreements between NERC, CAs and RCs 
that cover self-certification, 
self-assessment, and/or third-party audit. 

> Work with federal, state, and 
provincial/territorial jurisdictional departments and 
agencies to regularly update private and public 
sector standards, policies, and other 
guidance. 

♦ CAs and RCs: 

> Implement existing and emerging NERC 
standards. 

> Develop and implement best practices and 
policies for IT and security management 
drawing from existing NERC and government 
authorities’ best practices. 45 These should 
include, but not necessarily be limited to: 

1. Policies requiring that automation system 
products be delivered and installed with 
unnecessary services deactivated in order 
to improve “out-of-the-box security.” 

2. The creation of centralized system 
administration authority within each CA and RC 
to manage access and permissions for 
automation access (including vendor 
management backdoors, links to other automation 
systems, and administrative connections). 

> Authenticate and authorize controls that 
address EMS automation system ownership 
and boundaries, and ensure access is granted 
only to users who have corresponding job 
responsibilities. 


33. Develop and deploy IT management 
procedures. 

CAs’ and RCs’ IT and EMS support personnel 
should develop procedures for the development, 
testing, configuration, and implementation of tech¬ 
nology related to EMS automation systems and also 
define and communicate information security and 
performance requirements to vendors on a continu¬ 
ing basis. Vendors should ensure that system 
upgrades, service packs, and bug fixes are made 
available to grid operators in a timely manner. 

Interviews and analyses conducted by the SWG 
indicate that, in some instances, there were 
ill-defined and/or undefined procedures for EMS 
automation systems software and hardware devel¬ 
opment, testing, deployment, and backup. In addi¬ 
tion, there were specific instances of failures to 
perform system upgrade, version control, mainte¬ 
nance, rollback, and patch management tasks. 

At one CA, these procedural vulnerabilities were 
compounded by inadequate, out-of-date, or 
non-existing maintenance contracts with EMS vendors 
and contractors. This could lead to situations 
where grid operators could alter EMS components 
without vendor notification or authorization as 
well as scenarios in which grid operators are not 
aware of or choose not to implement 
vendor-recommended patches and upgrades. 


34. Develop corporate-level IT security 
governance and strategies. 

CAs and RCs and other grid-related organizations 
should have a planned and documented security 
strategy, governance model, and architecture for 
EMS automation systems. 

Interviews and analysis conducted by the SWG 
indicate that in some organizations there is evi¬ 
dence of an inadequate security policy, gover¬ 
nance model, strategy, or architecture for EMS 
automation systems. This is especially apparent 
with legacy EMS automation systems that were 
originally designed to be stand-alone systems but 
that are now interconnected with internal (corpo¬ 
rate) and external (vendors, Open Access Same 
Time Information Systems (OASIS), RCs, Internet, 
etc.) networks. It should be noted that in some of 
the organizations interviewed this was not the 
case and in fact they appeared to excel in the areas 
of security policy, governance, strategy, and 
architecture. 

In order to address the finding described above, 
the Task Force recommends that CAs, RCs, and 
other grid-related organizations have a planned 
and documented security strategy, governance 
model, and architecture for EMS automation sys¬ 
tems covering items such as network design, sys¬ 
tem design, security devices, access and 
authentication controls, and integrity manage¬ 
ment as well as backup, recovery, and contin¬ 
gency mechanisms. 


35. Implement controls to manage system 
health, network monitoring, and 
incident management. 

IT and EMS support personnel should implement 
technical controls to detect, respond to, and 
recover from system and network problems. Grid 
operators, dispatchers, and IT and EMS support 
personnel should be provided the tools and train¬ 
ing to ensure that the health of IT systems is moni¬ 
tored and maintained. 

Interviews and analysis conducted by the SWG 
indicate that in some organizations there was 
ineffective monitoring and control over 
EMS-supporting IT infrastructure and overall IT 
network health. In these cases, both grid operators 
and IT support personnel did not have situational 
awareness of the health of the IT systems that 
provide grid information both globally and locally. 
This resulted in an inability to detect, assess, 
respond to, and recover from IT system-related 
cyber failures (failed hardware/software, 
malicious code, faulty configurations, etc.). 

In order to address the finding described above, 
the Task Force recommends: 

♦ IT and EMS support personnel implement tech¬ 
nical controls to detect, respond to, and recover 
from system and network problems. 

♦ Grid operators, dispatchers, and IT and EMS 
support personnel be provided with the tools 
and training to ensure that: 

> The health of IT systems is monitored and 
maintained. 

> These systems have the capability to be 
repaired and restored quickly, with a 
minimum loss of time and access to global and 
internal grid information. 

> Contingency and disaster recovery 
procedures exist and can serve to temporarily 
substitute for systems and communications 
failures during times when EMS automation 
system health is unknown or unreliable. 

> Adequate verbal communication protocols 
and procedures exist between operators and 
IT and EMS support personnel so that 
operators are aware of any IT-related problems that 
may be affecting their situational awareness 
of the power grid. 


36. Initiate a U.S.-Canada risk 
management study. 

In cooperation with the electricity sector, federal 
governments should strengthen and expand the 
scope of the existing risk management initiatives 
by undertaking a bilateral (Canada-U.S.) study of 
the vulnerabilities of shared electricity 
infrastructure and cross border interdependencies. Common 
threat and vulnerability assessment methodologies 
should also be developed, based on the work 
undertaken in the pilot phase of the current joint 
Canada-U.S. vulnerability assessment initiative, 
and their use promoted by CAs and RCs. To 
coincide with these initiatives, the electricity sector, in 
association with federal governments, should 
develop policies and best practices for effective 
risk management and risk mitigation. 

Effective risk management is a key element in 
assuring the reliability of our critical infrastruc¬ 
tures. It is widely recognized that the increased 
reliance on IT by critical infrastructure sectors, 
including the energy sector, has increased the 
vulnerability of these systems to disruption via 
cyber means. The breadth of the August 14, 2003, 
power outage illustrates the vulnerabilities and 
interdependencies inherent in our electricity 
infrastructure. 

Canada and the United States, recognizing the 
importance of assessing the vulnerabilities of 
shared energy systems, included a provision to 
address this issue in the Smart Border Declara¬ 
tion, 46 signed on December 12, 2001. Both coun¬ 
tries committed, pursuant to Action Item 21 of the 
Declaration, to “conduct bi-national threat assess¬ 
ments on trans-border infrastructure and identify 
necessary protection measures, and initiate 
assessments for transportation networks and other 
critical infrastructure.” These joint assessments 
will serve to identify critical vulnerabilities, 
strengths and weaknesses while promoting the 
sharing and transfer of knowledge and technology 
to the energy sector for self-assessment purposes. 

A team of Canadian and American technical 
experts, using methodology developed by the 
Argonne National Laboratory in Chicago, Illinois, 
began conducting the pilot phase of this work in 
January 2004. The work involves a series of joint 
Canada-U.S. assessments of selected shared criti¬ 
cal energy infrastructure along the Canada-U.S. 
border, including the electrical transmission lines 
and dams at Niagara Falls - Ontario and New York. 
The pilot phase will be completed by March 31, 
2004. 

The findings of the ESWG and SWG suggest that 
among the companies directly involved in the 
power outage, vulnerabilities and interdependen¬ 
cies of the electric system were not well under¬ 
stood and thus effective risk management was 
inadequate. In some cases, risk assessments did 
not exist or were inadequate to support risk man¬ 
agement and risk mitigation plans. 

In order to address these findings, the Task Force 
recommends: 

♦ In cooperation with the electricity sector, 
federal governments should strengthen and 
expand the scope of the existing initiatives 
described above by undertaking a bilateral 
(Canada-U.S.) study of the vulnerabilities of 
shared electricity infrastructure and cross 
border interdependencies. The study should 
encompass cyber, physical, and personnel 
security processes and include mitigation and 
best practices, identifying areas that would 
benefit from further standardization. 

♦ Common threat and vulnerability assessment 
methodologies should be developed, based on 
the work undertaken in the pilot phase of the 
current joint Canada-U.S. vulnerability assess¬ 
ment initiative, and their use promoted by CAs 
and RCs. 

♦ The electricity sector, in association with fed¬ 
eral governments, should develop policies and 
best practices for effective risk management and 
risk mitigation. 


37. Improve IT forensic and diagnostic 
capabilities. 

CAs and RCs should seek to improve internal 
forensic and diagnostic capabilities, ensure that IT 
support personnel who support EMS automation 
systems are familiar with the systems’ design and 
implementation, and make certain that IT support 
personnel who support EMS automation systems 
have access to and are trained in using appropriate 
tools for diagnostic and forensic analysis and 
remediation. 

Interviews and analyses conducted by the SWG 
indicate that, in some cases, IT support personnel 
who are responsible for EMS automation systems 
are unable to perform forensic and diagnostic rou¬ 
tines on those systems. This appears to stem from 
a lack of tools, documentation and technical skills. 
It should be noted that some of the organizations 
interviewed excelled in this area but that overall 
performance was lacking. 

In order to address the finding described above, 
the Task Force recommends: 

♦ CAs and RCs seek to improve internal forensic 
and diagnostic capabilities as well as strengthen 
coordination with external EMS vendors and 
contractors who can assist in servicing EMS 
automation systems; 

♦ CAs and RCs ensure that IT support personnel 
who support EMS automation systems are 
familiar with the systems’ design and imple¬ 
mentation; and 

♦ CAs and RCs ensure that IT support personnel 
who support EMS automation systems have 
access to and are trained in using appropriate 
tools for diagnostic and forensic analysis and 
remediation. 


38. Assess IT risk and vulnerability at 
scheduled intervals. 

IT and EMS support personnel should perform 
regular risk and vulnerability assessment activities 
for automation systems (including EMS 
applications and underlying operating systems) to identify 
weaknesses, high-risk areas, and mitigating actions 
such as improvements in policy, procedure, and 
technology. 

Interviews and analysis conducted by the SWG 
indicate that in some instances risk and vulnera¬ 
bility management were not being performed on 
EMS automation systems and their IT supporting 
infrastructure. To some CAs, EMS automation sys¬ 
tems were considered “black box” 47 technologies; 
and this categorization removed them from the list 
of systems identified for risk and vulnerability 
assessment. 


39. Develop capability to detect wireless 
and remote wireline intrusion and 
surveillance. 

Both the private and public sector should promote 
the development of the capability of all CAs and 
RCs to reasonably detect intrusion and surveil¬ 
lance of wireless and remote wireline access points 
and transmissions. CAs and RCs should also con¬ 
duct periodic reviews to ensure that their user base 
is in compliance with existing wireless and remote 
wireline access rules and policies. 

Interviews conducted by the SWG indicate that 
most of the organizations interviewed had some 
type of wireless and remote wireline intrusion and 
surveillance detection protocol as a standard secu¬ 
rity policy; however, there is a need to improve 
and strengthen current capabilities regarding 
wireless and remote wireline intrusion and sur¬ 
veillance detection. The successful detection and 
monitoring of wireless and remote wireline access 
points and transmissions are critical to securing 
grid operations from a cyber security perspective. 

There is also evidence that although many of the 
organizations interviewed had strict policies 
against allowing wireless network access, periodic 
reviews to ensure compliance with these policies 
were not undertaken. 


40. Control access to operationally 
sensitive equipment. 

RCs and CAs should implement stringent policies 
and procedures to control access to sensitive equip¬ 
ment and/or work areas. 

Interviews conducted by the SWG indicate that 
at some CAs and RCs operationally sensitive 
computer equipment was accessible to non- 
essential personnel. Although most of these non- 
essential personnel were escorted through sensi¬ 
tive areas, it was determined that this procedure 
was not always enforced as a matter of everyday 
operations. 

In order to address the finding described above, 
the Task Force recommends: 

♦ That RCs and CAs develop policies and proce¬ 
dures to control access to sensitive equipment 
and/or work areas to ensure that: 

> Access is strictly limited to employees or 
contractors who utilize said equipment as part of 
their job responsibilities. 

> Access for other staff who need access to 
sensitive areas and/or equipment but are not 
directly involved in their operation (such as 
cleaning staff and other administrative 
personnel) is strictly controlled (via escort) and 
monitored. 


41. NERC should provide guidance on 
employee background checks. 

NERC should provide guidance on the implementa¬ 
tion of its recommended standards on background 
checks, and CAs and RCs should review their poli¬ 
cies regarding background checks to ensure they 
are adequate. 

Interviews conducted with sector participants 
revealed instances in which certain company con¬ 
tract personnel did not have to undergo back¬ 
ground check(s) as stringent as those performed 
on regular employees of a CA or RC. NERC Urgent 
Action Standard Section 1207 Paragraph 2.3 spec¬ 
ifies steps to remediate sector weaknesses in this 
area but there is a need to communicate and 
enforce this standard by providing the industry 
with recommended implementation guidance, 
which may differ among CAs and RCs. 

In order to address the finding described above, 
the Task Force recommends: 

♦ NERC provide guidance on the implementation 
of its recommended standards on background 
checks, especially as they relate to the screening 
of contracted and sub-contracted personnel. 

♦ CAs and RCs review their policies regarding 
background checks to ensure they are adequate 
before allowing sub-contractor personnel to 
access their facilities. 


42. Confirm NERC ES-ISAC as the central 
point for sharing security information 
and analysis. 

The NERC ES-ISAC should be confirmed as the 
central electricity sector point of contact for secu¬ 
rity incident reporting and analysis. Policies and 
protocols for cyber and physical incident reporting 
should be further developed including a mecha¬ 
nism for monitoring compliance. There also should 
be uniform standards for the reporting and sharing 
of physical and cyber security incident information 
across both the private and public sectors. 

There are currently both private and public sector
information sharing and analysis initiatives in
place to address the reporting of physical and
cyber security incidents within the electricity sector.
In the private sector, NERC operates an Electricity
Sector Information Sharing and Analysis
Center (ES-ISAC) specifically to address this
issue. On behalf of the U.S. Government, the
Department of Homeland Security (DHS) operates
the Information Analysis and Infrastructure Protection
(IAIP) Directorate to collect, process, and
act upon information on possible cyber and physical
security threats and vulnerabilities. In Canada,
Public Safety and Emergency Preparedness Canada
has a 24/7 operations center for the reporting
of incidents involving or impacting critical infrastructure.
As well, both in Canada and the U.S.,
incidents of a criminal nature can be reported to
law enforcement authorities of jurisdiction.

Despite these private and public physical and
cyber security information sharing and analysis
initiatives, an analysis of policies and procedures
within the electricity sector reveals that reporting
of security incidents to internal corporate security,
law enforcement, or government agencies
was uneven across the sector. The fact that these
existing channels for incident reporting, whether
security- or electricity systems-related, are currently
underutilized is an operating deficiency
which could hamper the industry’s ability to
address future problems in the electricity sector.


Interviews and analysis conducted by the SWG
further indicate an absence of coherent and effective
mechanisms for the private sector to share
information related to critical infrastructure with
government. There was also a lack of confidence
on the part of private sector infrastructure owners
and grid operators that information shared with
governments could be protected from disclosure
under Canada’s Access to Information Act (ATIA)
and the U.S. Freedom of Information Act (FOIA).
On the U.S. side of the border, however, the imminent
implementation of the Critical Infrastructure
Information (CII) Act of 2002 should mitigate
almost all industry concerns about FOIA disclosure.
In Canada, Public Safety and Emergency Preparedness
Canada relies on a range of mechanisms
to protect the sensitive information related to critical
infrastructure that it receives from its private
sector stakeholders, including the exemptions for
third party information that currently exist in the
ATIA and other instruments. At the same time,
Public Safety and Emergency Preparedness Canada
is reviewing options for stronger protection of
CI information, including potential changes in
legislation.

In order to address the finding described above, 
the Task Force recommends: 

♦ Confirmation of the NERC ES-ISAC as the central
electricity sector point of contact for security
incident reporting and analysis.

♦ Further development of NERC policies and protocols
for cyber and physical incident reporting,
including a mechanism for monitoring
compliance.

♦ The establishment of uniform standards for the
reporting of physical and cyber security incidents
to internal corporate security, private sector
sector-specific information sharing and
analysis bodies (including ISACs), law enforcement,
and government agencies.

♦ The further development of new mechanisms
and the promulgation of existing Canadian
and U.S. mechanisms (see endnote 48) to facilitate
the sharing of electricity sector threat and
vulnerability information across governments as
well as between the private sector and governments.

♦ Federal, state, and provincial/territorial governments
work to further develop and promulgate
measures and procedures that protect sensitive
critical infrastructure-related information
from disclosure.

43. Establish clear authority for physical
and cyber security. 

The Task Force recommends that corporations
establish clear authority and ownership for 
physical and cyber security. This authority 
should have the ability to influence 
corporate decision-making and the authority 
to make physical and cyber security-related 
decisions. 

Interviews and analysis conducted by the SWG
indicate that some power entities did not implement
best practices when organizing their security
staff. It was noted at several entities that the Information
System (IS) security staff reported to IT
support personnel such as the Chief Information
Officer (CIO).

Best practices across the IT industry, including
most large automated businesses, indicate that the
best way to balance security requirements properly
with the IT and operational requirements of a
company is to place security at a comparable level
within the organizational structure. By allowing
the security staff a certain level of autonomy, management
can properly balance the associated risks
and operational requirements of the facility.


44. Develop procedures to prevent or mitigate
inappropriate disclosure of information.

The private and public sectors should jointly 
develop and implement security procedures and 
awareness training in order to mitigate or prevent 
disclosure of information by the practices of open 
source collection, elicitation, or surveillance. 

SWG interviews and intelligence analysis provide
no evidence of the use of open source collection,
elicitation or surveillance against CAs or RCs leading
up to the August 14, 2003, power outage. However,
such activities may be used by malicious
individuals, groups, or nation states engaged in
intelligence collection in order to gain insights or
proprietary information on electric power system
functions and capabilities. Open source collection
is difficult to detect and thus is best countered
through careful consideration by industry stakeholders
of the extent and nature of publicly available
information. Methods of elicitation
and surveillance, by comparison, are more detectable
activities and may be addressed through
increased awareness and security training. In
addition to prevention and detection, it is equally
important that suspected or actual incidents of
these intelligence collection activities be reported
to government authorities.

In order to address the findings described above, 
the Task Force recommends: 

♦ The private and public sectors jointly develop
and implement security procedures and awareness
training in order to mitigate disclosure of
information not suitable for the public domain
and/or removal of previously available information
in the public domain (web sites, message
boards, industry publications, etc.).

♦ The private and public sectors jointly develop
and implement security procedures and awareness
training in order to mitigate or prevent disclosure
of information by the practices of
elicitation.

♦ The private and public sectors jointly develop
and implement security procedures and awareness
training in order to mitigate, prevent, and
detect incidents of surveillance.

♦ Where no mechanism currently exists, the private
and public sectors jointly establish a secure
reporting chain and protocol for use of the information
for suspected and known attempts and
incidents of elicitation and surveillance.

Group IV. Canadian 
Nuclear Power Sector 

The U.S. nuclear power plants affected by the 
August 14 blackout performed as designed. After 
reviewing the design criteria and the response of 
the plants, the U.S. members of the Nuclear 
Working Group had no recommendations relative 
to the U.S. nuclear power plants. 

As discussed in Chapter 8, Canadian nuclear
power plants did not trigger the power system outage
or contribute to its spread. Rather, they disconnected
from the grid as designed. The
Canadian members of the Nuclear Working Group
have, therefore, no specific recommendations
with respect to the design or operation of Canadian
nuclear plants that would improve the reliability
of the Ontario electricity grid. The
Canadian Nuclear Working Group, however,
made two recommendations to improve the
response to future events involving the loss of
off-site power, one concerning backup electrical
generation equipment for the CNSC’s Emergency
Operations Centre and another concerning the use
of adjuster rods during future events involving the
loss of off-site power. The Task Force accepted
these recommendations, which are presented
below.


45. The Task Force recommends that the
Canadian Nuclear Safety Commission
request Ontario Power Generation and
Bruce Power to review operating procedures
and operator training associated
with the use of adjuster rods.

OPG and Bruce Power should review their operating
procedures to see whether alternative procedures
could be put in place to carry out or reduce
the number of system checks required before placing
the adjuster rods into automatic mode. This
review should include an assessment of any regulatory
constraints placed on the use of the adjuster
rods, to ensure that risks are being appropriately
managed.

Current operating procedures require independent
checks of a reactor’s systems by the reactor
operator and the control room supervisor before
the reactor can be put in automatic mode to allow
the reactors to operate at 60% power levels. Alternative
procedures to allow reactors to run at 60%
of power while waiting for the grid to be
re-established may reduce other risks to the health
and safety of Ontarians that arise from the loss of a
key source of electricity. CNSC oversight and
approval of any changes to operating procedures
would ensure that health and safety, security, or
the environment are not compromised. The CNSC
would assess the outcome of the proposed review
to ensure that health and safety, security, and the
environment would not be compromised as a
result of any proposed action.


46. The Task Force recommends that the
Canadian Nuclear Safety Commission
purchase and install backup generation
equipment.

In order to ensure that the CNSC’s Emergency
Operations Center (EOC) is available and fully
functional during an emergency situation requiring
CNSC response, whether the emergency is
nuclear-related or otherwise, and that staff needed
to respond to the emergency can be accommodated
safely, the CNSC should have backup electrical
generation equipment of sufficient capacity to provide
power to the EOC, telecommunications and
Information Technology (IT) systems and accommodations
for the CNSC staff needed to respond to
an emergency.


The August 2003 power outage demonstrated that 
the CNSC’s Emergency Operations Center, IT, and 
communications equipment are vulnerable if 
there is a loss of electricity to the Ottawa area. 

Endnotes 

1 In fairness, it must be noted that reliability organizations in
some areas have worked diligently to implement recommendations
from earlier blackouts. According to the Initial Report
by the New York State Department of Public Service on the
August 14, 2003 Blackout, New York entities implemented all
100 of the recommendations issued after the New York City
blackout of 1977.

2 The need for a systematic recommitment to reliability by 
all affected organizations was supported in various ways by 
many commenters on the Interim Report, including Anthony 
J. Alexander, FirstEnergy; David Barrie, Hydro One Networks, 
Inc.; Joseph P. Carson, P.E.; Harrison Clark; F. J. Delea, J.A. 
Casazza, G.C. Loehr, and R. M. Malizewski, Power Engineers 
Seeking Truth; Ajay Garg and Michael Penstone, Hydro One 
Networks, Inc.; and Raymond K. Kershaw, International 
Transmission Company. 

3 See supporting comments expressed by Anthony J. Alexander,
FirstEnergy; Deepak Divan, SoftSwitching Technologies;
Pierre Guimond, Canadian Nuclear Association; Hans
Konow, Canadian Electricity Association; Michael Penstone,
Hydro One Networks, Inc.; and James K. Robinson, PPL.

4 See “The Economic Impacts of the August 2003 Blackout,” 
Electric Consumers Resource Council (ELCON), February 2, 
2004. 

5 The need for action to make standards enforceable was
supported by many commenters, including David Barrie,
Hydro One Networks, Inc.; Carl Burrell, IMO Ontario; David
Cook, North American Electric Reliability Council; Deepak
Divan, SoftSwitching Technologies; Charles J. Durkin, Northeast
Power Coordinating Council; David Goffin, Canadian
Chemical Producers’ Association; Raymond K. Kershaw,
International Transmission Company; Hans Konow, Canadian
Electricity Association; Barry Lawson, National Rural
Electric Cooperative Association; William J. Museler, New
York Independent System Operator; Eric B. Stephens, Ohio
Consumers’ Counsel; Gordon Van Welie, ISO New England,
Inc.; and C. Dortch Wright, on behalf of James McGreevey,
Governor of New Jersey.

6 This recommendation was suggested by some members of 
the Electric System Working Group. 

7 The need to evaluate and where appropriate strengthen the
institutional framework for reliability management was supported
in various respects by many commenters, including
Anthony J. Alexander, FirstEnergy Corporation; David Barrie,
Hydro One Networks, Inc.; Chris Booth, Experienced Consultants
LLC; Carl Burrell, IMO Ontario; Linda Campbell, Florida
Reliability Coordinating Council; Linda Church Ciocci,
National Hydropower Association; David Cook, NERC; F.J.
Delea, J.A. Casazza, G.C. Loehr, and R.M. Malizewski, Power
Engineers Seeking Truth; Charles J. Durkin, Northeast Power
Coordinating Council; Ajay Garg and Michael Penstone,
Hydro One Networks, Inc.; Michael W. Golay, Massachusetts
Institute of Technology; Leonard S. Hyman, Private Sector
Advisors, Inc.; Marija Ilic, Carnegie Mellon University; Jack
Kerr, Dominion Virginia Power; Raymond K. Kershaw,
International Transmission Company; Paul Kleindorfer, University
of Pennsylvania; Michael Kormos, PJM Interconnection;
Bill Mittelstadt, Bonneville Power Administration;
William J. Museler, New York Independent System Operator;
James K. Robinson, PPL; Eric B. Stephens, Ohio Consumers’
Counsel; John Synesiou, IMS Corporation; Gordon Van
Welie, ISO New England; Vickie Van Zandt, Bonneville
Power Administration; and C. Dortch Wright, on behalf of
James McGreevey, Governor of New Jersey.

8 Several commenters noted the importance of clarifying
that prudently incurred reliability expenses and investments
will be recoverable through regulator-approved rates. These
commenters include Anthony J. Alexander, FirstEnergy Corporation;
Deepak Divan, SoftSwitching Technologies; Stephen
Fairfax, MTechnology, Inc.; Michael W. Golay,
Massachusetts Institute of Technology; Pierre Guimond,
Canadian Nuclear Association; Raymond K. Kershaw, International
Transmission Company; Paul R. Kleindorfer, University
of Pennsylvania; Hans Konow, Canadian Electricity
Association; Barry Lawson, National Rural Electric Cooperative
Association; and Michael Penstone, Hydro One Networks,
Inc.

9 The concept of an ongoing NERC process to track the
implementation of existing and subsequent recommendations
was initiated by NERC and broadened by members of the
Electric System Working Group. See comments by David
Cook, North American Electric Reliability Council.

10 This recommendation was suggested by NERC and supported
by members of the Electric System Working Group.

11 See comments by Jack Kerr, Dominion Virginia Power, and 
Margie Phillips, Pennsylvania Services Integration 
Consortium. 

12 The concept of a “reliability impact consideration” was 
suggested by NERC and supported by the Electric System 
Working Group. 

13 The suggestion that EIA should become a source of reliability
data and information came from a member of the Electric
System Working Group.

14 Several commenters raised the question of whether there
was a linkage between the emergence of competition (or
increased wholesale electricity trade) in electricity markets
and the August 14 blackout. See comments by Anthony J.
Alexander, FirstEnergy Corporation; F.J. Delea, J.A. Casazza,
G.C. Loehr, and R.M. Malizewski, Power Engineers Seeking
Truth; Ajay Garg and Michael Penstone, Hydro One Networks,
Inc.; Brian O’Keefe, Canadian Union of Public
Employees; Les Pereira; and John Wilson.

15 NIMBY: “Not In My Back Yard.” 

16 Several commenters either suggested that government
agencies should expand their research in reliability-related
topics, or emphasized the need for such R&D more generally.
See comments by Deepak Divan, SoftSwitching Technologies;
Marija Ilic, Carnegie Mellon University; Hans Konow,
Canadian Electricity Association; Stephen Lee, Electric
Power Research Institute; James K. Robinson, PPL; John
Synesiou, IMS Corporation; and C. Dortch Wright on behalf of
Governor James McGreevey of New Jersey.

17 The concept of a standing framework for grid-related
investigations was initiated by members of the Electric System
Working Group, after noting that the U.S. National Aeronautics
and Space Administration (NASA) had created a
similar arrangement after the Challenger explosion in 1986.
This framework was put to use immediately after the loss of
the shuttle Columbia in 2003.


18 This subject was addressed in detail in comments by David
Cook, North American Electric Reliability Council; and in
part by comments by Anthony J. Alexander, FirstEnergy Corporation;
Ajay Garg, Hydro One Networks, Inc.; George
Katsuras, IMO Ontario; and Vickie Van Zandt, Bonneville
Power Administration.

19 U.S. Federal Energy Regulatory Commission, 105 FERC ¶
61,372, December 24, 2003.

20 See ECAR website,
http://www.ecar.org/documents/document%201_6-98.pdf.

21 See NERC website, http://www.nerc.com/standards/. 

22 The need to ensure better maintenance of required electrical
clearances in transmission right of way areas was emphasized
by several commenters, including Richard E. Abbott,
arborist; Anthony J. Alexander, FirstEnergy Corporation;
David Barrie, Hydro One Networks, Inc.; David Cook, North
American Electric Reliability Council; Ajay Garg and Michael
Penstone, Hydro One Networks, Inc.; Tadashi Mano, Tokyo
Electric Power Company; Eric B. Stephens, Ohio Consumers’
Counsel; Vickie Van Zandt, Bonneville Power Administration;
and Donald Wightman, Utility Workers Union of
America.

23 Utility Vegetation Management Final Report, CN Utility
Consulting, LLC, March 2004, commissioned by the U.S. Federal
Energy Regulatory Commission to support the investigation
of the August 14, 2003 blackout.

24 The need to strengthen and verify compliance with NERC 
standards was noted by several commenters. See comments 
by David Barrie, Hydro One Networks, Inc.; Carl Burrell, IMO 
Ontario; David Cook, North American Electric Reliability 
Council; and Eric B. Stephens, Ohio Consumers’ Counsel. 

25 The need to verify application of NERC standards via
readiness audits, before adverse incidents occur, was noted
by several commenters. See comments by David Barrie,
Hydro One Networks, Inc.; David Cook, North American Electric
Reliability Council; Barry Lawson, National Rural Electric
Cooperative Association; Bill Mittelstadt, Bonneville Power
Administration; and Eric B. Stephens, Ohio Consumers’
Counsel.

26 The need to improve the training and certification requirements
for control room management and staff drew many
comments. See comments by David Cook, North American
Electric Reliability Council; F.J. Delea, J.A. Casazza, G.C.
Loehr, and R.M. Malizewski, Power Engineers Seeking Truth;
Victoria Doumtchenko, MPR Associates; Pat Duran, IMO
Ontario; Ajay Garg and Michael Penstone, Hydro One Networks,
Inc.; George Katsuras, IMO Ontario; Jack Kerr, Dominion
Virginia Power; Tim Kucey, National Energy Board,
Canada; Stephen Lee, Electric Power Research Institute; Steve
Leovy, personal comment; Ed Schwerdt, Northeast Power
Coordinating Council; Tapani O. Seppa, The Valley Group,
Inc.; Eric B. Stephens, Ohio Consumers’ Counsel; Vickie Van
Zandt, Bonneville Power Administration; Don Watkins, Bonneville
Power Administration; and Donald Wightman, Utility
Workers Union of America.

27 This reliance, and the risk of an undue dependence, is 
often unrecognized in the industry. 

28 Many parties called for clearer statement of the roles,
responsibilities, and authorities of control areas and reliability
coordinators, particularly in emergency situations. See
comments by Anthony J. Alexander, FirstEnergy Corporation;
Chris Booth, Experienced Consultants LLC; Michael
Calimano, New York ISO; Linda Campbell, Florida Reliability
Coordinating Council; David Cook, North American Electric
Reliability Council; F.J. Delea, J.A. Casazza, G.C. Loehr, and
R.M. Malizewski, Power Engineers Seeking Truth; Mark
Fidrych, Western Area Power Administration; Ajay Garg and
Michael Penstone, Hydro One Networks, Inc.; Carl Hauser,
Washington State University; Stephen Kellat; Jack Kerr,
Dominion Virginia Power; Raymond K. Kershaw, International
Transmission Company; Michael Kormos, PJM Interconnection;
William J. Museler, New York Independent
System Operator; Tapani O. Seppa, The Valley Group, Inc.;
John Synesiou, IMS Corporation; Gordon Van Welie, ISO
New England, Inc.; Vickie Van Zandt, Bonneville Power
Administration; Kim Warren, IMO Ontario; and Tom
Wiedman, Consolidated Edison. Members of the Electric System
Working Group initiated the concept of defining an
“alert” status, between “normal” and “emergency,” and associated
roles, responsibilities, and authorities.

29 The need to make better use of system protection measures
received substantial comment, including comments by James
L. Blasiak, International Transmission Company; David Cook,
North American Electric Reliability Council; Charles J.
Durkin, Northeast Power Coordinating Council; F.J. Delea,
J.A. Casazza, G.C. Loehr, and R.M. Malizewski, Power Engineers
Seeking Truth; Ajay Garg and Michael Penstone, Hydro
One Networks, Inc.; Gurgen and Spartak Hakobyan, personal
study; Marija Ilic, Carnegie Mellon University; Shinichi Imai,
Tokyo Electric Power Company; Jack Kerr, Dominion Virginia
Power; Stephen Lee, Electric Power Research Institute; Ed
Schwerdt, Northeast Power Coordinating Council; Robert
Stewart, PG&E; Philip Tatro, National Grid Company; Carson
Taylor, Bonneville Power Administration; Vickie Van Zandt,
Bonneville Power Administration; Don Watkins, Bonneville Power
Administration; and Tom Wiedman, Consolidated Edison.

30 The subject of developing and adopting better real-time
tools for control room operators and reliability coordinators
drew many comments, including those by Anthony J. Alexander,
FirstEnergy Corporation; Eric Allen, New York ISO; Chris
Booth, Experienced Consultants, LLC; Mike Calimano, New
York ISO; Claudio Canizares, University of Waterloo
(Ontario); David Cook, North American Electric Reliability
Council; Deepak Divan, SoftSwitching Technologies; Victoria
Doumtchenko, MPR Associates; Pat Duran, IMO Ontario; Bill
Eggertson, Canadian Association for Renewable Energies;
Ajay Garg and Michael Penstone, Hydro One Networks, Inc.;
Jack Kerr, Dominion Virginia Power; Raymond K. Kershaw,
International Transmission Company; Michael Kormos, PJM
Interconnection; Tim Kucey, National Energy Board, Canada;
Steve Lapp, Lapp Renewables; Stephen Lee, Electric Power
Research Institute; Steve Leovy; Tom Levy; Peter Love, Canadian
Energy Efficiency Alliance; Frank Macedo, Hydro One
Networks, Inc.; Bill Mittelstadt, Bonneville Power Administration;
Fiona Oliver, Canadian Energy Efficiency Alliance;
Peter Ormund, Mohawk College; Don Ross, Prince Edward
Island Wind Co-op Limited; James K. Robinson, PPL; Robert
Stewart, PG&E; John Synesiou, IMS Corporation; Gordon Van
Welie, ISO New England, Inc.; Vickie Van Zandt, Bonneville
Power Administration; Don Watkins, Bonneville Power
Administration; Chris Winter, Conservation Council of
Ontario; and David Zwergel, Midwest ISO. The concept of requiring
annual testing and certification of operators’ EMS and
SCADA systems was initiated by a member of the Electric
System Working Group. Also, see comments by John
Synesiou, IMS Corporation.

31 The need to strengthen reactive power and voltage control
practices was the subject of several comments. See comments
by Claudio Canizares, University of Waterloo (Ontario);
David Cook, North American Electric Reliability Council; F.J.
Delea, J.A. Casazza, G.C. Loehr, and R.M. Malizewski, Power
Engineers Seeking Truth; Stephen Fairfax, MTechnology,
Inc.; Ajay Garg and Michael Penstone, Hydro One Networks,
Inc.; Shinichi Imai and Toshihiko Furuya, Tokyo Electric
Power Company; Marija Ilic, Carnegie Mellon University;
Frank Macedo, Hydro One Networks, Inc.; and Tom
Wiedman, Consolidated Edison. Several commenters
addressed issues related to the production of reactive power
by producers of power for sale in wholesale markets. See comments
by Anthony J. Alexander, FirstEnergy Corporation;
K.K. Das, PowerGrid Corporation of India, Limited; F.J. Delea,
J.A. Casazza, G.C. Loehr, and R.M. Malizewski, Power Engineers
Seeking Truth; Stephen Fairfax, MTechnology, Inc.;
and Carson Taylor, Bonneville Power Administration.

32 See pages 107-108. 

33 U.S. Federal Energy Regulatory Commission, 105 FERC ¶
61,372, December 24, 2003.

34 The need to improve the quality of system modeling data
and data exchange practices received extensive comment. See
comments from Michael Calimano, New York ISO; David
Cook, North American Electric Reliability Council; Robert
Cummings, North American Electric Reliability Council; F.J.
Delea, J.A. Casazza, G.C. Loehr, and R.M. Malizewski, Power
Engineers Seeking Truth; Mark Fidrych, Western Area Power
Administration; Jack Kerr, Dominion Virginia Power; Raymond
K. Kershaw, International Transmission Company;
Frank Macedo, Hydro One Networks, Inc.; Vickie Van Zandt,
Bonneville Power Administration; Don Watkins, Bonneville
Power Administration; and David Zwergel, Midwest ISO.

35 Several commenters addressed the subject of NERC’s standards
in various respects, including Anthony J. Alexander,
FirstEnergy Corporation; Carl Burrell, IMO Ontario; David
Cook, North American Electric Reliability Council; F.J. Delea,
J.A. Casazza, G.C. Loehr, and R.M. Malizewski, Power Engineers
Seeking Truth; Charles J. Durkin, Northeast Power
Coordinating Council; Ajay Garg and Michael Penstone,
Hydro One Networks, Inc.; Jack Kerr, Dominion Virginia
Power; James K. Robinson, PPL; Mayer Sasson, New York
State Reliability Council; and Kim Warren, IMO Ontario.

36 See Initial Report by the New York State Department of Public
Service on the August 14, 2003 Blackout (2004), and comments
by Mayer Sasson, New York State Reliability Council.

37 F.J. Delea, J.A. Casazza, G.C. Loehr, and R.M. Malizewski, 
“The Need for Strong Planning and Operating Criteria to 
Assure a Reliable Bulk Power Supply System,” January 29, 
2004. 

38 The need to tighten communications protocols and
improve communications systems was cited by several
commenters. See comments by Anthony J. Alexander,
FirstEnergy Corporation; David Barrie, Hydro One Networks,
Inc.; Carl Burrell, IMO Ontario; Michael Calimano, New York
ISO; David Cook, North American Electric Reliability Council;
Mark Fidrych, Western Area Power Administration; Ajay
Garg and Michael Penstone, Hydro One Networks, Inc.; Jack
Kerr, Dominion Virginia Power; William Museler, New York
ISO; John Synesiou, IMS Corporation; Vickie Van Zandt,
Bonneville Power Administration; Don Watkins, Bonneville
Power Administration; and Tom Wiedman, Consolidated Edison.

39 See comments by Tapani O. Seppa, The Valley Group, Inc. 

40 Several commenters noted the need for more systematic 
use of time-synchronized data recorders. In particular, see 
David Cook, North American Electric Reliability Council; 
Ajay Garg and Michael Penstone, Hydro One Networks, Inc.; 
and Robert Stewart, PG&E. 

41 The importance of learning from the system restoration
experience associated with the August 14 blackout was
stressed by Linda Church Ciocci, National Hydropower Association;
David Cook, North American Electric Reliability
Council; Frank Delea; Bill Eggertson, Canadian Association
for Renewable Energies; Stephen Lee, Electric Power
Research Institute; and Kim Warren, IMO Ontario.

42 The need to clarify the criteria for identifying critical facilities
and improving dissemination of updated information
about unplanned outages was cited by Anthony J. Alexander,
FirstEnergy Corporation; and Raymond K. Kershaw, International
Transmission Company.

43 The need to streamline the TLR process and limit the use of
it to non-urgent situations was discussed by several
commenters, including Anthony J. Alexander, FirstEnergy
Corporation; Carl Burrell, IMO Ontario; Jack Kerr, Dominion
Virginia Power; Raymond K. Kershaw, International Transmission
Company; and Ed Schwerdt, Northeast Power Coordinating
Council.

44 NERC Standards at www.nerc.com (Urgent Action Standard
1200, Cyber Security, Reliability Standard 1300, Cyber
Security) and Joint DOE/PCIB standards guidance at
www.ea.doe.gov/pdfs/21stepsbooklet.pdf (“21 Steps to Improve
Cyber Security of SCADA Networks”).

45 For example: “21 Steps to Improve Cyber Security of
SCADA Networks,”
http://www.ea.doe.gov/pdfs/21stepsbooklet.pdf.

46 Canadian reference:
http://www.dfait-maeci.gc.ca/anti-terrorism/actionplan-en.asp;
U.S. reference:
http://www.whitehouse.gov/news/releases/2001/12/20011212-6.html.

47 A “black box” technology is any device, sometimes highly
important, whose workings are not understood by or accessible
to its user.

48 DOE Form 417 is an example of an existing, but 
underutilized, private/public sector information sharing 
mechanism. 

Appendix A


Members of the U.S.-Canada Power System Outage 
Task Force and Its Three Working Groups 


Task Force Co-Chairs 

Spencer Abraham, Secretary of the U.S. Department
of Energy (USDOE)

R. John Efford, Canadian Minister of Natural
Resources (current) and Herb Dhaliwal
(August-December 2003)

Canadian Task Force Members 

Linda J. Keen, President and CEO of the Canadian
Nuclear Safety Commission

Anne McLellan, Deputy Prime Minister and Minister
of Public Safety and Emergency
Preparedness

John Manley, (previous) Deputy Prime Minister 
and Minister of Finance 

Kenneth Vollman, Chairman of the National 
Energy Board 

U.S. Task Force Members 

Nils J. Diaz, Chairman of the Nuclear Regulatory 
Commission 

Tom Ridge, Secretary of the U.S. Department of 
Homeland Security (DHS) 

Pat Wood, III, Chairman of the Federal Energy 
Regulatory Commission (FERC) 

Principals Managing the Working 
Groups 

Jimmy Glotfelty, Director, Office of Electric 
Transmission and Distribution, USDOE 

Dr. Nawal Kamel, Special Advisor to the Deputy 
Minister of Natural Resources Canada (NRCan) 


Working Groups 

Electric System Working Group 

Co-Chairs

David Meyer, Senior Advisor, Office of Electric 
Transmission and Distribution, USDOE (U.S. 
Government) 


Thomas Rusnov, Senior Advisor, Natural 
Resources Canada (Government of Canada) 

Alison Silverstein, Senior Energy Policy Advisor 
to the Chairman, FERC (U.S. Government) 

Canadian Members

David Barrie, Senior Vice President, Asset Management,
Hydro One

David Burpee, Director, Renewable and Electrical
Energy Division, NRCan (Government of
Canada)

David McFadden, Chair, National Energy and 
Infrastructure Industry Group, Gowling, Lafleur, 
Henderson LLP (Ontario) 

U.S. Members 

Donald Downes, Public Utility Commission 
Chairman (Connecticut) 

Joseph H. Eto, Staff Scientist, Ernest Orlando
Lawrence Berkeley National Laboratory, Consortium
for Electric Reliability Technology Solutions
(CERTS)

Jeanne M. Fox, President, New Jersey Board of
Public Utilities (New Jersey)

H. Kenneth Haase, Sr. Vice President, Transmission,
New York Power Authority (New York)

J. Peter Lark, Chairman, Public Service Commission
(Michigan)

Blaine Loper, Senior Engineer, Pennsylvania 
Public Utility Commission (Pennsylvania) 

William McCarty, Chairman, Indiana Utility 
Regulatory Commission (Indiana) 

David O’Brien, Vermont Public Service Depart¬ 
ment, Commissioner (Vermont) 

David O’Connor, Commissioner, Division of 
Energy Resources, Office of Consumer Affairs 
and Business Regulation (Massachusetts) 

Alan Schriber, Public Utility Commission Chair¬ 
man (Ohio) 

Gene Whitney, Policy Analyst, Office of Science 
and Technology Policy (U.S. Government) 

Security Working Group

Co-Chairs

William J.S. Elliott, Assistant Secretary to the 
Cabinet, Security and Intelligence, Privy Council 
Office (Government of Canada) 

Robert Liscouski, Assistant Secretary for Infra¬ 
structure, Department of Homeland Security 
(U.S. Government) 

Canadian Members

Curt Allen, Director Corporate Security, Manage¬ 
ment Board Secretariat, Office of the Corporate 
Chief Information Officer, Government of 
Ontario 

Gary Anderson, Chief, Counter-Intelligence- 
Global, Canadian Security Intelligence Service 
(Government of Canada) 

Michael Devancy, Deputy Chief, Information 
Technology Security, Communications Security 
Establishment (Government of Canada) 

James Harlick, Assistant Deputy Minister, Public 
Safety and Emergency Preparedness Canada 
(Government of Canada) 

Peter MacAulay, Officer in Charge of Technolog¬ 
ical Crime Branch, Royal Canadian Mounted 
Police (Government of Canada) 

Ralph Mahar, Chief, Technical Operations, Sci¬ 
entific and Technical Services, Canadian Secu¬ 
rity Intelligence Service (Government of Canada) 

Dr. James Young, Commissioner of Public Secu¬ 
rity, Ontario Ministry of Public Safety and Secu¬ 
rity (Ontario) 

U.S. Members

Sid Casperson, Director, Office of Counter Ter¬ 
rorism (New Jersey) 

Vincent DeRosa, Deputy Commissioner, Director 
of Homeland Security, Department of Public 
Safety (Connecticut) 

Harold M. Hendershot, Acting Section Chief, 
Computer Intrusion Section, Federal Bureau of 
Investigation (U.S. Government) 

Kevin Kolevar, Chief of Staff to the Deputy Sec¬ 
retary of Energy, Department of Energy (U.S. 
Government) 

Paul Kurtz, Special Assistant to the President
and Senior Director for Critical Infrastructure
Protection, Homeland Security Council (U.S.
Government)

James McMahon, Senior Advisor (New York) 

Colonel Michael C. McDaniel, Assistant Adju¬ 
tant General for Homeland Security (Michigan) 

John Overly, Executive Director, Division of 
Homeland Security (Ohio) 

Andy Purdy, Deputy Director, National Cyber 
Security Division, Information Analysis and 
Infrastructure Protection Directorate, DHS 

Kerry L. Sleeper, Commissioner, Public Safety 
(Vermont) 

Arthur Stephens, Deputy Secretary for Informa¬ 
tion Technology, Office of Administration 
(Pennsylvania) 

Steve Schmidt, Section Chief, Special Technol¬ 
ogies and Applications, FBI 

Richard Swensen, Under Secretary, Office of 
Public Safety and Homeland Security 
(Massachusetts) 

Simon Szykman, Senior Policy Analyst, Office of 
Science and Technology Policy (U.S. 
Government) 


Nuclear Working Group 

Co-Chairs

Nils Diaz, Chairman, Nuclear Regulatory Com¬ 
mission (U.S. Government) 

Linda J. Keen, President and Chief Executive 
Officer, Canadian Nuclear Safety Commission 
(Government of Canada) 

Canadian Members 

James Blyth, Director General, Directorate of 
Power Regulation, Canadian Nuclear Safety 
Commission (Government of Canada) 

Duncan Hawthorne, Chief Executive Officer, 
Bruce Power (Ontario) 

Robert Morrison, Senior Advisor to the Deputy 
Minister, Natural Resources Canada (Govern¬ 
ment of Canada) 

Ken Pereira, Vice President, Operations Branch, 
Canadian Nuclear Safety Commission (Govern¬ 
ment of Canada) 

U.S. Members

David J. Allard, CHP, Director, Bureau of Radia¬ 
tion Protection Department of Environmental 
Protection (Pennsylvania) 

Frederick F. Butler, Commissioner, New Jersey 
Board of Public Utilities (New Jersey) 

Sam J. Collins, Deputy Executive Director for 
Reactor Programs, Nuclear Regulatory 
Commission 

Paul Eddy, Power Systems Operations Specialist, 
Public Service Commission (New York) 

J. Peter Lark, Chairman, Public Service Commis¬ 
sion (Michigan) 

William D. Magwood IV, Director, Office of 
Nuclear Energy, Science and Technology, 
Department of Energy (U.S. Government) 


Dr. G. Ivan Moldonado, Associate Professor, 
Mechanical, Industrial and Nuclear Engineering; 
University of Cincinnati (Ohio) 

David O’Brien, Commissioner, Department of 
Public Service (Vermont) 

David O’Connor, Commissioner, Division of 
Energy Resources, Office of Consumer Affairs 
and Business Regulation (Massachusetts) 

Gene Whitney, Policy Analyst, National Science 
and Technology Policy, Executive Office of the 
President (U.S. Government) 

Edward Wilds, Bureau of Air Management, 
Department of Environmental Protection 
(Connecticut) 


This report reflects tireless efforts by hundreds of individuals not identified by name above. They include 
electrical engineers, information technology experts, and other specialists from across the North American 
electricity industry, the academic world, regulatory agencies in the U.S. and Canada, the U.S. Department of 
Energy and its national laboratories, the U.S. Department of Homeland Security, the U.S. Federal Bureau of 
Investigation, Natural Resources Canada, the Royal Canadian Mounted Police, the Bonneville Power 
Administration, the Western Area Power Administration, the Tennessee Valley Authority, the North Ameri¬ 
can Electric Reliability Council, PJM Interconnection, Inc., Ontario’s Independent Market Operator, and 
many other organizations. The members of the U.S.-Canada Power System Outage Task Force thank these 
individuals, and congratulate them for their dedication and professionalism. 

Appendix B


Description of Outage Investigation and 
Process for Development of Recommendations 


On August 14, 2003, the northeastern U.S. and 
Ontario, Canada, suffered one of the largest power 
blackouts in the history of North America. The 
area affected extended from New York, Massachu¬ 
setts, and New Jersey west to Michigan, and from 
Ohio north to Ontario, Canada. 

President George W. Bush and Prime Minister 
Jean Chrétien created a U.S.-Canada Task Force to
identify the causes of the power outage and to 
develop recommendations to prevent and contain 
future outages. U.S. Energy Secretary Spencer 
Abraham and Minister of Natural Resources Can¬ 
ada Herb Dhaliwal, meeting in Detroit, Michigan, 
on August 20, agreed on an outline for the activi¬ 
ties of the Task Force. 

This appendix outlines the process used for the 
determination of why the blackout occurred and 
was not contained and explains how recommen¬ 
dations were developed to prevent and minimize 
the scope of future outages. Phase I of the process 
was completed when the Interim Report, identify¬ 
ing what happened and why, was released on 
November 19, 2003. This Final Report, released on 
April 5, 2004, completes Phase II of the process by 
providing recommendations acceptable to both 
countries for preventing and reducing the scope of 
future blackouts. This report, which encompasses 
both the findings of the Interim Report and 
updated information from continued analysis by 
the investigative teams, totally supersedes the 
Interim Report. 

During Phase II, the Task Force sought the views 
of the public and expert stakeholders in Canada 
and the U.S. towards the development of the final 
recommendations. People were asked to comment 
on the Interim Report and provide their views on 
recommendations to enhance the reliability of the 
electric system in each country. The Task Force 
collected this information by several methods, 
including public forums, workshops of technical 
experts, and electronic submissions to the NRCan 
and DOE web sites. 

Verbatim transcripts of the forums and workshops 
were provided on-line, on both the NRCan and 
DOE web sites. In Canada, which operates in both 
English and French, comments were posted in the
language in which they were submitted. Individuals
who either commented on the Interim Report,
provided suggestions for recommendations to 
improve reliability, or both are listed in Appendix 
C. Their input was greatly appreciated. Their 
comments can be viewed in full or in summary 
at http://www.nrcan.gc.ca or at
http://www.electricity.doe.gov.

Task Force Composition and 
Responsibilities 

The co-chairs of the Task Force were U.S. Secre¬ 
tary of Energy Spencer Abraham and Minister of 
Natural Resources Canada (NRCan) Herb 
Dhaliwal for Phase I and Minister of NRCan R. 
John Efford for Phase II. Other U.S. members were 
Nils J. Diaz, Chairman of the Nuclear Regulatory 
Commission, Tom Ridge, Secretary of Homeland 
Security, and Pat Wood III, Chairman of the Fed¬ 
eral Energy Regulatory Commission. The other 
Canadian members were Deputy Prime Minister 
John Manley during Phase I and Anne McLellan, 
Deputy Prime Minister and Minister of Public 
Safety and Emergency Preparedness during Phase 
II, Linda J. Keen, President and CEO of the Cana¬ 
dian Nuclear Safety Commission, and Kenneth 
Vollman, Chairman of the National Energy Board. 
The coordinators for the Task Force were Jimmy 
Glotfelty on behalf of the U.S. Department of 
Energy and Dr. Nawal Kamel on behalf of Natural 
Resources Canada. 

On August 27, 2003, Secretary Abraham and Min¬ 
ister Dhaliwal announced the formation of three 
Working Groups to support the work of the Task 
Force. The three Working Groups addressed elec¬ 
tric system issues, security matters, and questions 
related to the performance of nuclear power plants 
over the course of the outage. The members of the 
Working Groups were officials from relevant fed¬ 
eral departments and agencies, technical experts, 
and senior representatives from the affected states 
and the Province of Ontario. 

U.S.-Canada-NERC Investigation Team 

Under the oversight of the Task Force, three inves¬ 
tigative teams of electric system, nuclear and 
cyber and security experts were established to
investigate the causes of the outage. The electric 
system investigative team was comprised of indi¬ 
viduals from several U.S. federal agencies, the 
U.S. Department of Energy’s national laboratories, 
Canadian electric industry, Canada’s National 
Energy Board, staff from the North American Elec¬ 
tric Reliability Council (NERC), and the U.S. elec¬ 
tricity industry. The overall investigative team 
was divided into several analytic groups with spe¬ 
cific responsibilities, including data management, 
determining the sequence of outage events, sys¬ 
tem modeling, evaluation of operating tools and 
communications, transmission system perfor¬ 
mance, generator performance, NERC and regula¬ 
tory standards/procedures and compliance, 
system planning and design studies, vegetation 
and right-of-way management, transmission and 
reliability investments, and root cause analysis. 

Additional teams of experts were established to 
address issues related to the performance of 
nuclear power plants affected by the outage, and 
physical and cyber security issues related to the 
bulk power infrastructure. The security and 
nuclear investigative teams also had liaisons who 
worked closely with the various electric system 
investigative teams mentioned above. 

Function of the Working Groups 

The U.S. and Canadian co-chairs of each of the 
three Working Groups (i.e., an Electric System 
Working Group, a Nuclear Working Group, and a 
Security Working Group) designed investigative 
assignments to be completed by the investigative 
teams. These findings were synthesized into a sin¬ 
gle Interim Report reflecting the conclusions of 
the three investigative teams and the Working 
Groups. For Phase II, the Interim Report was 
enhanced with new information gathered from the 
technical conferences, additional modeling and 
analysis and public comments. Determination of 
when the Interim and Final Reports were com¬ 
plete and appropriate for release to the public was 
the responsibility of the U.S.-Canada Task Force 
and the investigation co-chairs. 

Confidentiality of Data and Information 

Given the seriousness of the blackout and the 
importance of averting or minimizing future 
blackouts, it was essential that the Task Force’s 
teams have access to pertinent records and data 
from the regional transmission operators (RTOs) 
and independent system operators (ISOs) and
electric companies affected by the blackout, and
data from the nuclear and security associated enti¬ 
ties. The investigative teams also interviewed 
appropriate individuals to learn what they saw 
and knew at key points in the evolution of the out¬ 
age, what actions they took, and with what pur¬ 
pose. In recognition of the sensitivity of this 
information, Working Group members and mem¬ 
bers of the teams signed agreements affirming that 
they would maintain the confidentiality of data 
and information provided to them, and refrain 
from independent or premature statements to the 
media or the public about the activities, findings, 
or conclusions of the individual Working Groups 
or the Task Force as a whole. 

After publication of the Interim Report, the Task 
Force investigative teams continued to evaluate 
the data collected during Phase I. Continuing with 
Phase I criteria, confidentiality was maintained in 
Phase II, and all investigators and working group 
members were asked to refrain from independent 
or premature statements to the media or the public 
about the activities, findings, or conclusions of the 
individual Working Groups or the Task Force as a 
whole. 

Relevant U.S. and Canadian Legal 
Framework 

United States 

The Secretary of Energy directed the Department 
of Energy (DOE) to gather information and con¬ 
duct an investigation to examine the cause or 
causes of the August 14, 2003 blackout. In initiat¬ 
ing this effort, the Secretary exercised his author¬ 
ity under section 11 of the Energy Supply and 
Environmental Coordination Act of 1974, and sec¬ 
tion 13 of the Federal Energy Administration Act 
of 1974, to gather energy-related information and 
conduct investigations. This authority gives him 
and the DOE the ability to collect such energy 
information as he deems necessary to assist in the 
formulation of energy policy, to conduct investiga¬ 
tions at reasonable times and in a reasonable man¬ 
ner, and to conduct physical inspections at energy 
facilities and business premises. In addition, DOE 
can inventory and sample any stock of fuels or 
energy sources therein, inspect and copy records, 
reports, and documents from which energy information
has been or is being compiled, and question
such persons as it deems necessary.
DOE worked closely with Natural Resources Can¬ 
ada and NERC on the investigation. 

Canada

Minister Dhaliwal, as the Minister responsible for 
Natural Resources Canada, was appointed by 
Prime Minister Chrétien as the Canadian Co-Chair
of the Task Force. Minister Dhaliwal worked 
closely with his American Co-Chair, Secretary of 
Energy Abraham, as well as NERC and his provin¬ 
cial counterparts in carrying out his responsibili¬ 
ties. When NRCan Minister R. John Efford 
assumed his role as the new Canadian Co-Chair, 
he continued to work closely with Secretary Abra¬ 
ham and the three Working Groups. 

Under Canadian law, the Task Force was charac¬ 
terized as a non-statutory, advisory body that does 
not have independent legal personality. The Task 
Force did not have any power to compel evidence 
or witnesses, nor was it able to conduct searches 
or seizures. In Canada, the Task Force relied on 
voluntary disclosure for obtaining information 
pertinent to its work. 

Oversight and Coordination 

The Task Force’s U.S. and Canadian coordinators 
held frequent conference calls to ensure that all 
components of the investigation were making 
timely progress. They briefed both Secretary Abra¬ 
ham and Minister R. John Efford (Minister 
Dhaliwal, Phase I) regularly and provided weekly 
summaries from all components on the progress of 
the investigation. During part of Phase I, the lead¬ 
ership of the electric system investigation team 
held daily conference calls to address analytical 
and process issues important to the investigation. 
The three Working Groups held weekly confer¬ 
ence calls to enable the investigation teams to 
update the Working Group members on the state 
of the overall analysis. Conference calls also 
focused on the analysis updates and the need to 
ensure public availability of all inputs to the 
development of recommendations. Working 
Group members attended panels and face-to-face 
meetings to review drafts of the report. 

Electric System Investigation Phase I 
Investigative Process 

Collection of Data and Information from ISOs,
Utilities, States, and the Province of Ontario 

On Tuesday, August 19, 2003, investigators affili¬ 
ated with the U.S. Department of Energy (DOE) 
began interviewing control room operators and 
other key officials at the ISOs and the companies 
most directly involved with the initial stages of the 
outage. In addition to the information gained in
the interviews, the interviewers sought information
and data about control room operations and
practices, the organization’s system status and 
conditions on August 14, the organization’s oper¬ 
ating procedures and guidelines, load limits on its 
system, emergency planning and procedures, sys¬ 
tem security analysis tools and procedures, and 
practices for voltage and frequency monitoring. 
Similar interviews were held later with staff at 
Ontario’s Independent Electricity Market Opera¬ 
tor (IMO) and Hydro One in Canada. 

On August 22 and 26, NERC directed the reliabil¬ 
ity coordinators at the ISOs to obtain a wide range 
of data and information from the control area coor¬ 
dinators under their oversight. The data requested 
included Supervisory Control and Data Acquisition
(SCADA) logs, Energy Management System (EMS) 
logs, alarm logs, data from local digital fault 
recorders, data on transmission line and generator 
“trips” (i.e., automatic disconnection to prevent 
physical damage to equipment), state estimator 
data, operator logs and transcripts, and informa¬ 
tion related to the operation of capacitors, phase 
shifting transformers, load shedding, static var 
compensators, special protection schemes or sta¬ 
bility controls, and high-voltage direct current 
(HVDC) facilities. NERC issued another data 
request to FirstEnergy on September 15 for copies 
of studies since 1990 addressing voltage support, 
reactive power supply, static capacitor applica¬ 
tions, voltage requirements, import or transfer 
capabilities (in relation to reactive capability or 
voltage levels), and system impacts associated 
with unavailability of the Davis-Besse plant. All 
parties were instructed that data and information 
provided to either DOE or NERC did not have to be 
submitted a second time to the other entity—all 
material provided would go into a common data 
base. 

For the Interim Report the investigative team held 
three technical conferences (August 22, Septem¬ 
ber 8-9, and October 1-3) with the RTOs and ISOs 
and key utilities aimed at clarifying the data 
received, filling remaining gaps in the data, and 
developing a shared understanding of the data’s 
implications. 

Data “Warehouse” 

The data collected by the investigative team was 
organized in an electronic repository containing 
thousands of transcripts, graphs, generator and 
transmission data and reports at the NERC head¬ 
quarters in Princeton, New Jersey. The warehouse 
contains more than 20 gigabytes of information, in 
more than 10,000 files. This established a set of
validated databases that the analytic teams could 
access as needed. 

Individual investigative teams conducted their 
activities through a number of in-person meetings 
as well as conference calls and e-mail communica¬ 
tions over the months of the investigation. 
Detailed investigative team findings will be 
included in upcoming technical reports issued by 
NERC. 

The following were the information sources for 
the Electric System Investigation: 

♦ Interviews conducted by members of the 
U.S.-Canada Electric Power System Outage 
Investigation Team with personnel at all of the 
utilities, control areas and reliability coordina¬ 
tors in the weeks following the blackout. 

♦ Three fact-gathering meetings conducted by the 
Investigation Team with personnel from the 
above organizations on August 22, September 8 
and 9, and October 1 to 3, 2003. 

♦ Three public hearings held in Cleveland, Ohio; 
New York City, New York; and Toronto, 
Ontario. 

♦ Two technical conferences held in Philadel¬ 
phia, Pennsylvania, and Toronto, Canada. 

♦ Materials provided by the above organizations 
in response to one or more data requests from 
the Investigation Team. 

♦ All taped phone transcripts between involved 
operations centers. 

♦ Additional interviews and field visits with oper¬ 
ating personnel on specific issues in October 
2003 and January 2004. 

♦ Field visits to examine transmission lines and 
vegetation at short-circuit locations. 

♦ Materials provided by utilities and state regula¬ 
tors in response to data requests on vegetation 
management issues. 

♦ Detailed examination of thousands of individ¬ 
ual relay trips for transmission and generation 
events. 

Data Exploration and Requirements 

This group requested data from the following con¬ 
trol areas and their immediate neighbors: MISO, 
MECS, FE, PJM, NYISO, ISO-NE, and IMO. The
data exploration and requirements group's
objective was to identify industry procedures that
are in place today for collecting information fol¬ 
lowing large-scale transmission related power out¬ 
ages and to assess those procedures in terms of the 
August 14, 2003 power outage investigation. 

They sought to: 

♦ Determine what happened in terms of immedi¬ 
ate causes, sequence of events, and resulting 
consequences; 

♦ Understand the failure mechanism via record¬ 
ings of system variables such as frequency, volt¬ 
ages, and flows; 

♦ Enable disturbance re-creation using computer 
models for the purposes of understanding the 
mechanism of failure, identifying ways to avoid 
or mitigate future failures, and assessing and 
improving the integrity of computer models; 

♦ Identify deeper, underlying factors contributing 
to the failure (e.g., general policies, standard 
practices, communication paths, organizational 
cultures). 

Sequence of Events 

More than 800 events occurred during the black¬ 
out of August 14. The events included the opening 
and closing of transmission lines and associated 
breakers and switches, the opening of transform¬ 
ers and associated breakers, and the tripping and 
starting of generators and associated breakers. 
Most of these events occurred in the few minutes 
of the blackout cascade between 16:06 and 16:12 
EDT. A blackout of this magnitude cannot be
analyzed properly until the sequence of events
is known accurately.

Establishing a precise and accurate sequence of 
outage-related events was a critical building block 
for the other parts of the investigation. One of the 
key problems in developing this sequence was 
that although much of the data pertinent to an 
event was time-stamped, there was variation from 
source to source in how the time-stamping was 
done, and not all of the time-stamps were synchro¬ 
nized to the National Institute of Standards and 
Technology (NIST) standard clock in Boulder, CO. 
Validating the timing of specific events became a 
large, important, and sometimes difficult task. 
This work was also critical to the issuance by the 
Task Force on September 12 of a “timeline” for the 
outage. The timeline briefly described the princi¬ 
pal events, in sequence, leading up to the initia¬ 
tion of the outage’s cascade phase, and then in the 
cascade itself. The timeline was not intended,
however, to address the causal relationships 
among the events described, or to assign fault or 
responsibility for the blackout. All times in the 
chronology are in Eastern Daylight Time. 
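
As a rough illustration of the time-alignment problem
described above, the following Python sketch merges event
records from several recorders into one ordered sequence
after correcting each source by a clock offset. The source
names, offsets, and events are hypothetical, and the
investigation's actual validation procedure is not described
in this appendix; this is only one simple way such a
correction might be applied.

```python
from datetime import datetime, timedelta

# Hypothetical clock offsets (seconds) of each data source relative to a
# reference clock; a positive value means the source clock runs ahead.
CLOCK_OFFSETS_S = {"utility_a_scada": 2.5, "utility_b_dfr": -0.8, "iso_ems": 0.0}


def to_reference_time(source, stamp):
    """Convert a source-local timestamp to reference time."""
    return stamp - timedelta(seconds=CLOCK_OFFSETS_S[source])


def build_sequence(records):
    """Return (reference_time, source, description) tuples in time order."""
    corrected = [(to_reference_time(src, ts), src, desc) for src, ts, desc in records]
    return sorted(corrected, key=lambda rec: rec[0])


# Illustrative records only; these are not actual August 14 data.
records = [
    ("utility_a_scada", datetime(2003, 8, 14, 16, 6, 3), "line A opens"),
    ("utility_b_dfr", datetime(2003, 8, 14, 16, 6, 0), "line B opens"),
    ("iso_ems", datetime(2003, 8, 14, 16, 6, 1), "generator trips"),
]
sequence = build_sequence(records)
for when, source, description in sequence:
    print(when.time(), source, description)
```

In this made-up example the uncorrected timestamps would
place "line A opens" last, while the corrected sequence
places it first, which is the kind of reordering that makes
unsynchronized time-stamps hazardous for causal analysis.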

System Modeling and Simulation Analysis 

The system modeling and simulation team 
(SMST) replicated system conditions on August 
14 and the events leading up to the blackout. The 
modeling reflects the state of the electric system. 
Once benchmarked to actual conditions at 
selected critical times on August 14, it allowed 
analysts to conduct a series of sensitivity studies 
to determine if the system was stable and within 
limits at each point in time leading up to the cas¬ 
cade. The analysis also confirmed when the sys¬ 
tem became unstable and allowed analysts to test 
whether measures such as load-shedding would 
have prevented the cascade. 

This team consisted of a number of NERC staff and 
persons with expertise in areas necessary to read 
and interpret all of the data logs, digital fault 
recorder information, sequence-of-events recorder
information, etc. The team consisted of about
40 people involved at various times, supplemented
by additional experts from the affected areas who
helped interpret the data.

Overall, this team: 

♦ Created steady-state power flow cases for 
observed August 14 system conditions starting 
at 15:00 EDT through about 16:05 EDT (when 
powerflow simulations were no longer ade¬ 
quate), about the time of the Sammis-Star 
345-kV outage. 

♦ Compiled relevant data for dynamic modeling 
of affected systems (e.g. generator dynamic 
models, load characteristics, special protection 
schemes, etc.). 

♦ Performed rigorous contingency analysis (over
800 contingencies run in the Eastern Interconnection)
to determine whether the system was operating
within thermal and voltage limits, and
within limits for possible further contingencies
(N-1 contingencies), prior to and during the initial
events of the blackout sequence.

♦ Performed sensitivity analysis to determine the 
significance of pre-existing conditions such as 
transmission outages in Cinergy and Dayton, 
and the earlier loss of Eastlake unit 5 
generation. 

♦ Performed “what-if” analysis to determine 
potential impacts of remedial actions such as
reclosing of outaged facilities during the
sequence of events, load shedding, generation 
redispatch, and combinations of load shedding 
and redispatch. 

♦ Compared transaction tags for August 14, to 
show how they matched up with those of other 
days in 2003 and 2002. 

♦ Analyzed the transactions and generation dis¬ 
patch changes used to bring replacement power 
for the loss of Eastlake 5 generation into 
FirstEnergy, to determine where the replace¬ 
ment power came from. 

♦ Analyzed the performance of the Interchange 
Distribution Calculator (IDC) and its potential 
capability to help mitigate the overloads. 
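
The idea behind the N-1 contingency screening listed above
can be sketched in a few lines of Python. This is a minimal
illustration under stated assumptions, not the team's
method: it uses a linear "DC" power flow approximation on a
hypothetical three-bus system, whereas the investigation
worked with full power flow cases of the Eastern
Interconnection. All line data, limits, injections, and
function names are invented for the example.

```python
# Hypothetical 3-bus system: (from_bus, to_bus, reactance_pu, limit_mw).
LINES = [(0, 1, 0.1, 120.0), (0, 2, 0.1, 120.0), (1, 2, 0.1, 120.0)]
# Net injection at each bus in MW (generation minus load); bus 0 is the slack.
INJECTIONS_MW = [150.0, -60.0, -90.0]


def dc_flows(lines, injections):
    """Solve a DC power flow for a connected 3-bus system (bus 0 = reference)."""
    # Build the 2x2 susceptance matrix for the non-reference buses 1 and 2.
    b = [[0.0, 0.0], [0.0, 0.0]]
    for f, t, x, _ in lines:
        for bus in (f, t):
            if bus != 0:
                b[bus - 1][bus - 1] += 1.0 / x
        if f != 0 and t != 0:
            b[f - 1][t - 1] -= 1.0 / x
            b[t - 1][f - 1] -= 1.0 / x
    p1, p2 = injections[1], injections[2]
    det = b[0][0] * b[1][1] - b[0][1] * b[1][0]  # zero if the network is split
    theta = [0.0,
             (p1 * b[1][1] - b[0][1] * p2) / det,
             (b[0][0] * p2 - b[1][0] * p1) / det]
    return {(f, t): (theta[f] - theta[t]) / x for f, t, x, _ in lines}


def overloads(lines, injections):
    """Return {line: flow_mw} for lines whose |flow| exceeds the line limit."""
    flows = dc_flows(lines, injections)
    limits = {(f, t): lim for f, t, _, lim in lines}
    return {ln: fl for ln, fl in flows.items() if abs(fl) > limits[ln] + 1e-9}


def n_minus_1_screen(lines, injections):
    """Remove each line in turn and report post-contingency overloads."""
    results = {}
    for i, (f, t, _, _) in enumerate(lines):
        remaining = lines[:i] + lines[i + 1:]
        results[(f, t)] = overloads(remaining, injections)
    return results


base_case = overloads(LINES, INJECTIONS_MW)
screen = n_minus_1_screen(LINES, INJECTIONS_MW)
print("base case overloads:", base_case)
for outage, violations in screen.items():
    print("outage of", outage, "->", violations)
```

In this invented case the base case has no overloads, yet
losing either line out of bus 0 overloads the other, so the
system is within limits but not N-1 secure. That distinction
is what a contingency screen of this kind is meant to
expose.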

The SMST began its efforts using the base case 
data and model provided by FirstEnergy as its 
foundation. 

The modeling and system studies work was per¬ 
formed under the guidance of a specially formed 
MAAC-ECAR-NPCC (MEN) Coordinating Group, 
consisting of the Regional Managers from those 
three regions impacted by the blackout, and their 
respective regional chairmen or designees. 

Assessment of Operations Tools, SCADA/EMS, 
Communications, and Operations Planning 

The Operations Tools, SCADA/EMS, Communica¬ 
tions, and Operations Planning Team assessed the 
observability of the electric system to operators 
and reliability coordinators, and the availability 
and effectiveness of operational (real-time and 
day-ahead) reliability assessment tools, including 
redundancy of views and the ability to observe the 
“big picture” regarding bulk electric system condi¬ 
tions. The team investigated operating practices 
and effectiveness of operating entities and reliabil¬ 
ity coordinators in the affected area. This team 
investigated all aspects of the blackout related to 
operator and reliability coordinator knowledge of 
system conditions, action or inactions, and 
communications. 

The Operations and Tools team conducted exten¬ 
sive interviews with operating personnel at 
the affected facilities. They participated in the 
technical investigation meetings with affected 
operators in August, September and October and 
reviewed the August 14 control room transcripts 
in detail. This group investigated the performance 
of the MISO and FirstEnergy EMS hardware and 
software and its impact on the blackout, and 
looked at operator training (including the use 
of formal versus “on-the-job” training) and the 
communications and interactions between the
operations and information technology support 
staff at both organizations. 

Frequency/ACE Analysis 

The Frequency/ACE Team analyzed potential fre¬ 
quency anomalies that may have occurred on 
August 14, as compared to typical interconnection 
operations. The team also determined whether 
there were any unusual issues with control perfor¬ 
mance and frequency and any effects they may 
have had related to the cascading failure, and 
whether frequency-related anomalies were con¬ 
tributing factors or symptoms of other problems 
leading to the cascade. 
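
For readers unfamiliar with the quantity, area control
error (ACE) is not defined in this appendix. The sketch
below assumes the conventional industry definition,
ACE = (NIa - NIs) - 10B(Fa - Fs) - IME, where NI is net
interchange in MW, F is frequency in Hz, B is the
(negative) frequency bias in MW per 0.1 Hz, and IME is
meter error. The bias value and sample readings are
hypothetical and are not taken from the investigation.

```python
def area_control_error(ni_actual_mw, ni_sched_mw, freq_actual_hz,
                       freq_sched_hz=60.0, bias_mw_per_tenth_hz=-50.0,
                       meter_error_mw=0.0):
    """ACE in MW: (NIa - NIs) - 10*B*(Fa - Fs) - IME, with B negative."""
    interchange_dev = ni_actual_mw - ni_sched_mw
    freq_term = 10.0 * bias_mw_per_tenth_hz * (freq_actual_hz - freq_sched_hz)
    return interchange_dev - freq_term - meter_error_mw


# Hypothetical reading: exporting 20 MW less than scheduled while system
# frequency is 0.02 Hz low. A negative ACE indicates under-generation.
ace = area_control_error(ni_actual_mw=480.0, ni_sched_mw=500.0,
                         freq_actual_hz=59.98)
print(round(ace, 3))
```

With these made-up inputs the interchange shortfall and the
frequency term combine to an ACE of about -30 MW; comparing
such values against an area's typical range is one simple
way to flag unusual control performance.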

Assessment of Transmission System 
Performance, Protection, Control,
Maintenance, and Damage

This team investigated the causes of all transmis¬ 
sion facility automatic operations (trips and 
reclosings) leading up to and through to the end of 
the cascade on all facilities greater than 100 kV. 
Included in the review were relay protection and 
remedial action schemes, including under¬ 
frequency load-shedding and identification of the 
cause of each operation and any misoperations 
that may have occurred. The team also assessed 
transmission facility maintenance practices in the 
affected area as compared to good utility practice 
and identified any transmission equipment that 
was damaged as a result of the cascading outage. 
The team reported patterns and conclusions 
regarding what caused transmission facilities to 
trip; why the cascade extended as far as it did
and not further into other systems; any
misoperations and the effect those misoperations
had on the outage; and any transmission equipment
damage. The team also reported on the transmission
facility maintenance practices of entities
in the affected area compared to good utility 
practice. 

Assessment of Generator Performance, 
Protection, Controls, Maintenance, and 
Damage 

This team investigated the cause of generator trips 
for all generators with a 10 MW or greater name¬ 
plate rating leading to and through the end of the 
cascade. The review included the cause for the 
generator trips, relay targets, unit power runbacks, 
and voltage/reactive power excursions. The team 
reported any generator equipment that was damaged
as a result of the cascading outage. The team
reported on patterns and conclusions regarding
what caused generation facilities to trip. The team 
identified any unexpected performance anomalies 
or unexplained events. The team assessed genera¬ 
tor maintenance practices in the affected area as 
compared to good utility practice. The team ana¬ 
lyzed the coordination of generator under¬ 
frequency settings with transmission settings, 
such as under-frequency load shedding. The team 
gathered and analyzed data on affected nuclear 
units and worked with the Nuclear Regulatory 
Commission to address U.S. nuclear unit issues. 

The Generator Performance team sent out an
extensive data request to generator owners during
Phase I of the investigation, but did not receive the
bulk of the responses until Phase II. The analysis
in this report uses the time of generator trip as it
was reported by the plant owner, or the time when
the generator ceased feeding power into the grid as
determined by a system monitoring device, and
synchronizes those times to other known grid
events as closely as possible. However, many generation
owners offered little information on the cause
of unit trips or key information on conditions at
their units, so it may never be possible to fully
determine what happened to all the generators
affected by the blackout, and why they performed
as they did. In particular, it is not clear what point
in time each reported generator trip time reflects:
the moment when the generator first detected the
condition that caused it to trip, the moment several
seconds later when it actually stopped feeding power
into the grid, or some point in between. This lack of
clear data hampered effective investigation of
generator issues.

Vegetation Management 

For Phase I the Vegetation/Right of Way Team conducted
a field investigation into the contacts that
occurred between trees and conductors on August
14 within the FirstEnergy, Dayton Power & Light
and Cinergy service areas. The team also examined
detailed information gained from data
requests to these and other utilities, including historical
outages from tree contacts on these lines.
These findings were included in the Interim
Report and detailed in an interim report on utility
vegetation management, posted at
http://www.ferc.gov/cust-protect/moi/uvm-initial-report.pdf.

The team also requested information from the
public utility commissions in the blackout area on
any state requirements for transmission vegetation
management and right-of-way maintenance.
Beginning in Phase I and continuing into Phase II,
the Vegetation/ROW team looked in detail at the
vegetation management and ROW maintenance
practices for the three utilities above, and compared
them to accepted utility practices across
North America. Issues examined included ROW
legal clearance agreements with landowners, budgets,
tree-trimming cycles, organization structure,
and use of herbicides. Through CN Utility Consulting,
the firm hired by FERC to support the
blackout investigation, the Vegetation/ROW team
also identified “best practices” for transmission
ROW management. They used those practices to
evaluate the performance of the three utilities
involved in August 14 line outages and also to
evaluate the effectiveness of utility vegetation
management practices generally.

On March 2, 2004, FERC released CN Utility Consulting’s
“Utility Vegetation Management Final
Report” (see
http://www.ferc.gov/cust-protect/moi/uvm-final-report.pdf).

Root Cause Analysis 

The investigation team used a technique called
root cause analysis to help guide the overall investigation
process in an effort to identify root causes
and contributing factors leading to the start of the
blackout in Ohio. The root cause analysis team
worked closely with the technical investigation
teams, providing feedback and queries on additional
information. Also, drawing on other data
sources as needed, the root cause analysis verified
facts regarding conditions and actions (or inactions)
that contributed to the blackout.

Root cause analysis is a systematic approach to
identifying and validating causal linkages among
conditions, events, and actions (or inactions) leading
up to a major event of interest, in this case the
August 14 blackout. It has been successfully
applied in investigations of events such as nuclear
power plant incidents, airplane crashes, and the
recent Columbia space shuttle disaster.

Root cause analysis is driven by facts and logic.
Events and conditions that may have helped to
cause the major event in question are described in
factual terms, and causal linkages are established
between the major event and earlier conditions or
events. Such earlier conditions or events are
examined in turn to determine their causes, and at
each stage the investigators ask whether the particular
condition or event could have developed or
occurred if a proposed cause (or combination of
causes) had not been present. If the particular
event being considered could have occurred without
the proposed cause (or combination of causes),
the proposed cause or combination of causes is
dropped from consideration and other possibilities
are considered.

Root cause analysis typically identifies several or
even many causes of complex events; each of the
various branches of the analysis is pursued until
either a “root cause” is found or a non-correctable
condition is identified. (A condition might be considered
non-correctable due to existing law,
fundamental policy, laws of physics, etc.) Sometimes
a key event in a causal chain leading to the
major event could have been prevented by timely
action by one or another party; if such action was
feasible, and if the party had a responsibility to
take such action, the failure to do so becomes a
root cause of the major event.
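
The counterfactual test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the investigation team's actual tooling: the function necessary_causes, the rule example_rule, and the labels cause_a, cause_b and cause_c are all hypothetical. Each proposed cause is removed in turn, and it is kept only if the event could not have occurred without it.

```python
from typing import Callable, FrozenSet, List

# A rule answers: "given these conditions, could the event occur?"
Rule = Callable[[FrozenSet[str]], bool]


def necessary_causes(conditions: FrozenSet[str], could_occur: Rule) -> List[str]:
    """Return the proposed causes that survive the counterfactual test."""
    if not could_occur(conditions):
        raise ValueError("the event does not occur under the stated conditions")
    kept = []
    for cause in sorted(conditions):
        # Counterfactual: the same situation, minus this one proposed cause.
        without_cause = conditions - {cause}
        if could_occur(without_cause):
            continue  # event still possible, so drop this proposed cause
        kept.append(cause)
    return kept


def example_rule(present: FrozenSet[str]) -> bool:
    """Hypothetical rule: the event needs both cause_a and cause_b."""
    return "cause_a" in present and "cause_b" in present


proposed = frozenset({"cause_a", "cause_b", "cause_c"})
print(necessary_causes(proposed, example_rule))
```

In this hypothetical case cause_c is dropped because the event could still occur without it, while cause_a and cause_b are retained. The sketch tests one proposed cause at a time; testing combinations of causes, as the text also allows, would require removing subsets instead of single items.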

Phase II 

On December 12, 2003, Paul Martin became
the new Prime Minister of Canada and assumed
responsibility for the Canadian section of the
Power System Outage Task Force. Prime Minister
Martin appointed R. John Efford as the new Minister
of Natural Resources Canada and co-chair of
the Task Force.

Press releases, a U.S. Federal Register notice, and 
ads in the Canadian press notified the public and 
stakeholders of Task Force developments. All 
public statements were released to the media and 
are available on the OETD and the NRCan web 
sites. 

Several of the investigative teams began their
work during Phase I and completed it during
Phase II. Other teams could not begin their investigation
into the events related to the cascade and
blackout, beginning at 16:05:57 EDT on August
14, 2003, until analysis of the Ohio events before
that point was completed in Phase I.

System Planning, Design and Studies Team 

The SPDST studied reactive power management,
transactions scheduling, system studies and system
operating limits for the Ohio and ECAR areas.
In addition to the data in the investigation data
warehouse, the team submitted six comprehensive
data requests to six control areas and reliability
coordinators, including FirstEnergy, to build
the foundation for its analyses. The team examined
reactive power and voltage management policies,
practices and criteria and compared them to
actual and modeled system conditions in the
affected area and neighboring systems. They
reviewed the process of assessing and approving
transaction schedules and tags and the coordination
of those schedules and transactions in
August 2003, and looked at the impact of tagged
transactions on key facilities on August 14. Similarly,
the team examined system operating limits
in effect for the affected area on August 14, how
they had been determined, and whether they were
appropriate to the grid as it existed in August
2003. They reviewed system studies conducted by
FirstEnergy and ECAR for 2003 and prior years,
including the methodologies and assumptions
used in those studies and how those were coordinated
across adjoining control areas and councils.
The SPDST also examined how the studied conditions
compared to actual conditions on August 14.
For all these matters, the team compared the policies,
studies and practices to good utility
practices.

The SPDST worked closely with the Modeling and
System Simulation Team. They used data provided
by the control areas, RTOs and ISOs on
actual system conditions across August 2003, and
NERC Tag Dump and TagNet data. To do the voltage
analyses, the team started with the MSST’s
base case data and model of the entire Eastern
Interconnection, then used a more detailed model
of the FE area provided by FirstEnergy. With these
models they conducted extensive PV and VQ analyses
for different load levels and contingency
combinations in the Cleveland-Akron area, running
over 10,000 different power flow simulations.
Team members have extensive experience
and expertise in long-term and operational planning
and system modeling.
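
A PV analysis traces how the voltage at a load bus falls as the power delivered to it rises, up to the "nose" beyond which no power flow solution exists. As a rough illustration only, and not the team's models or data, the sketch below computes such a curve for a hypothetical two-bus system: a source of voltage e feeding a load through a lossless line of reactance x, all in per unit. The function names, the parameter values and the two-bus simplification are assumptions made for this example.

```python
import math
from typing import List, Optional, Tuple


def receiving_voltage(p: float, e: float = 1.0, x: float = 0.25,
                      tan_phi: float = 0.0) -> Optional[float]:
    """High-voltage solution at the load bus, or None past the nose.

    Solves V^4 + (2*Q*X - E^2)*V^2 + X^2*(P^2 + Q^2) = 0 with Q = P*tan_phi,
    which follows from the lossless two-bus power flow equations.
    """
    q = p * tan_phi
    b = e ** 2 - 2.0 * q * x
    disc = b ** 2 - 4.0 * x ** 2 * (p ** 2 + q ** 2)
    if disc < 0.0 or b <= 0.0:
        return None  # no real solution: the load exceeds the transfer limit
    return math.sqrt((b + math.sqrt(disc)) / 2.0)


def pv_curve(p_values: List[float], **kwargs) -> List[Tuple[float, Optional[float]]]:
    """Evaluate the load-bus voltage for each load level (hypothetical helper)."""
    return [(p, receiving_voltage(p, **kwargs)) for p in p_values]


for p, v in pv_curve([0.0, 0.5, 1.0, 1.5, 2.0, 2.5]):
    label = "no solution" if v is None else f"{v:.3f}"
    print(f"P = {p:.1f} pu  V = {label}")
```

With the default values and a unity power factor load, the nose of the curve sits at P = E^2/(2X) = 2.0 per unit, where V = E/sqrt(2), or about 0.707 per unit. A study of a real network would repeat a full power flow solution for each load level and contingency rather than use a closed-form expression.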

NERC Standards, Procedures and Compliance 
Team 

The SP&C team was charged with reviewing the
NERC Operating Policies and Planning Standards
for any violations that occurred in the events leading
up to and during the blackout, and assessing
the sufficiency or deficiency of NERC and regional
reliability standards, policies and procedures.
They were also directed to develop and conduct
audits to assess compliance with the NERC and
regional reliability standards as relevant to the
cause of the outage.

The team members, all experienced participants
in the NERC compliance and auditing program,
examined the findings of the Phase I investigation
in detail, building particularly upon the root cause
analysis. They looked independently into many
issues, conducting additional interviews as
needed. The team distinguished between those
violations which could be clearly proven and
those which were problematic but not fully provable.
The SP&C team offered a number of conclusions
and recommendations to improve
operational reliability, NERC standards, the standards
development process and the compliance
program.

Dynamic Modeling of the Cascade 

This work was conducted as an outgrowth of the
work done by the System Modeling and Simulation
team in Phase I, by a team composed of the
NPCC System Studies-38 Working Group on
Inter-Area Dynamic Analysis, augmented by representatives
from ECAR, MISO, PJM and SERC.
Starting with the steady-state power flows developed
in Phase I, they moved the analysis forward
across the Eastern Interconnection from 16:05:50
EDT onward in a series of simulations, first
steady-state and then dynamic, to understand how
conditions changed across the grid.

This team is using the model to conduct a series of
“what if” analyses, to better understand what conditions
contributed to the cascade and what might
have happened if events had played out differently.
This work is described further within Chapter 6.

Additional Cascade Analysis 

The core team for the cascade investigation drew
upon the work of all the teams to understand the
cascade after 16:05:57. The investigation’s official
Sequence of Events was modified and corrected as
appropriate as additional information came in
from asset owners, and as modeling and other
investigation revealed inaccuracies in the initial
data reports. The team issued additional data
requests and looked closely at the data collected
across the period of the cascade. The team organized
the analysis by attempting to link the individual
area and facility events to the power flows,
voltages and frequency data recorded by Hydro
One’s PSDRs (as seen in Figures 6.16 and 6.25)
and similar data sets collected elsewhere. This
effort improved the team’s understanding of the
interrelationships between the interaction, timing
and impacts of lines, loads and generation trips,
which are now being confirmed by dynamic modeling.
Graphing, mapping and other visualization
tools also created insights into the cascade, as
with the revelation of the role of zone 3 relays in
accelerating the early spread of the cascade within
Ohio and Michigan.

The team was aided in its work by the ability to
learn from the studies and reports on the blackout
completed by various groups outside the investigation,
including those by the Public Utility Commission
of Ohio, the Michigan Public Service
Commission, the New York ISO, ECAR and the
Public Service Commission of New York.

Beyond the work of the Electric System investigation,
the Security and Nuclear investigation teams
conducted additional analyses and updated their
interim reports with the additional findings.

Preparation of Task Force 
Recommendations 

Public and stakeholder input was an important 
component in the development of the Task Force’s 
recommendations. The input received covered a 
wide range of subjects, including enforcement of 
reliability standards, improving communications, 
planning for responses to emergency conditions, 
and the need to evaluate market structures. See 
Appendix C for a list of contributors. 

Three public forums and two technical conferences
were held to receive public comments on
the Interim Report and suggested recommendations
for consideration by the Task Force. These
events were advertised by various means, including
announcements in the Federal Register and the
Canada Gazette, advertisements in local newspapers
in the U.S., invitations to industry through
NERC, invitations to the affected state and
provincial regulatory bodies, and government
press releases. All written inputs received at
these meetings and conferences were posted for
additional comment on public websites maintained
by the U.S. Department of Energy and Natural
Resources Canada (www.electricity.doe.gov
and www.nrcan.gc.ca, respectively). The transcripts
from the meetings and conferences were
also posted on these websites.

♦ Members of all three Working Groups participated
in public forums in Cleveland, Ohio
(December 4, 2003), New York City (December
5, 2003), and Toronto, Ontario (December 8,
2003).

♦ The ESWG held two technical conferences, in
Philadelphia, Pennsylvania (December 16,
2003), and Toronto, Ontario (January 9, 2004).

♦ The NWG also held a public meeting on
nuclear-related issues pertaining to the blackout
at the U.S. Nuclear Regulatory Commission
headquarters in Rockville, Maryland (January 6,
2004).

The electric system investigation team also developed
an extensive set of technical findings based
on team analyses and cross-team discussions as
the Phase I and Phase II work progressed. Many of
these technical findings were reflected in NERC’s
actions and initiatives of February 10, 2004. In
turn, NERC’s actions and initiatives received significant
attention in the development of the Task
Force’s recommendations.

The SWG convened in January 2004 in Ottawa to
review the Interim Report. The SWG also held virtual
meetings with the investigative team leads
and working group members.

Similarly, the ESWG conducted weekly telephone 
conferences and it held face-to-face meetings on 
January 30, March 3, and March 18, 2004.

Appendix C

List of Commenters 


The individuals listed below either commented on the Interim Report, provided suggestions for recommendations
to improve reliability, or both. Their input was greatly appreciated. Their comments can be
viewed in full or in summary at http://www.nrcan.gc.ca or at http://www.electricity.doe.gov.


Abbott, Richard E. 

Personal comment 

Adams, Tom 

Energy Probe 

Akerlund, John 

Uninterruptible Power Networks UPN AB 

Alexander, Anthony J. 

FirstEnergy 

Allen, Eric 

New York ISO 

Barrie, David 

Hydro One 

Benjamin, Don 

North American Electric Reliability Council (NERC) 

Besich, Tom 

Electric power engineer 

Blasiak, James L. 

DykemaGossett PLLC for International Transmission Company (ITC) 

Booth, Chris 

Experienced Consultants LLC 

Boschmann, Armin 

Manitoba Hydro 

Brown, Glenn W. 

New Brunswick Power Corp; NPCC Representative & Chairman, NERC Disturbance Analysis 
Working Group 

Burke, Thomas J. 

Orion Associates International, Inc. 

Burrell, Carl 

IMO Ontario 

Bush, Tim 

Consulting 

Calimano, Michael 

New York ISO 

Canizares, Claudio A. 

University of Waterloo, Ontario Canada 

Carpentier, Philippe

French grid operator

Carson, Joseph P.

Personal comment

Casazza, J. A.

Power Engineers Seeking Truth

Chen, Shihe 

Power Systems Business Group, CLP Power Hong Kong Ltd. 

Church, Bob 

Management Consulting Services, Inc. 

Clark, Harrison 

Harrison K. Clark 

Cook, David 

NERC 

Cummings, Bob 

Director of Reliability Assessments and Support Services, NERC 

Das, K K 

IEEE member, PowerGrid Corporation of India Limited 

Delea, F. J. 

Power Engineers Seeking Truth 

Delea, Frank 

ConEdison 

Divan, Deepak 

Soft Switching Technologies 

Doumtchenko, Victoria 

MPR Associates 

Duran, Pat 

IMO Ontario 

Durkin, Charles J. 

Northeast Power Coordinating Council (NPCC) 

Eggertson, Bill 

Canadian Association for Renewable Energies

Fernandez, Rick

Personal comment 

Fidrych, Mark 

Western Area Power Administration (WAPA) and Chairman of the NERC Operating Committee

Furuya, Toshihiko 

Tokyo Electric Power Co., Inc. 

Galatic, Alex 

Personal comment 

Garg, Ajay 

Hydro One Networks Inc. 

Goffin, Dave

Canadian Chemical Producers Association

Gruber, William M. Ondrey 

Attorney 

Guimond, Pierre 

Canadian Nuclear Association 

Gurdziel, Tom 

Personal comment 

Hakobyan, Spartak and 
Gurgen 

Personal comment 

Han, Masur

Personal comment 

Hauser, Carl 

School of Electrical Engineering and Computer Science, Washington State University 

Hebert, Larry 

Thunder Bay Hydro 

Hilt, Dave 

NERC 

Hughes, John P. 

ELCON 

Imai, Shinichi 

Tokyo Electric Power 

Jeyapalan, Jey K. 

Jeyapalan & Associates, LLC 

Johnston, Sidney A. 

Personal comment 

Kane, Michael 

Personal comment 

Katsuras, George 

Independent Electric Market Operator of Ontario 

Kellat, Stephen 

Personal comment 

Kerr, Jack 

Dominion Virginia Power 

Kerr, Jack 

Best Real-time Reliability Analysis Practices Task Force 

Kershaw, Raymond K. 

International Transmission Company 

Kolodziej, Eddie 

Personal comment 

Konow, Hans 

Canadian Electricity Association 

Kormos, Mike 

PJM 

Kucey, Tim 

National Energy Board (Canada) 

Laugier, Alexandre 

Personal comment 

Lawson, Barry 

National Rural Electric Cooperative Association 

Lazarewicz, Matthew L. 

Beacon Power Corp. 

Lee, Stephen 

Electric Power Research Institute 

Leovy, Steve 

Personal comment 

Linda Campbell 

Florida Reliability Coordinating Council 

Loehr, G.C. 

Power Engineers Seeking Truth 

Love, Peter 

Canadian Energy Efficiency Alliance 

Macedo, Frank 

Hydro One 

Maliszewski, R.M. 

Power Engineers Seeking Truth 

McMonagle, Rob 

Canadian Solar Industries Association

Meissner, Joseph 

Personal comment

Middlestadt, Bill

Bonneville Power Administration 

Milter, Carolyn 

Cuyahoga County Board of Commissioners, and member, Community Advisory Panel; panel created
for Cleveland Electric Illuminating Co. (later First Energy)

Mitchell, Terry 

Excel Energy 

Moore, Scott 

AEP 

Murphy, Paul 

IMO Ontario 

Museler, William J. 

New York Independent System Operator 

O’Keefe, Brian 

Canadian Union of Public Employees 

Oliver, Fiona 

Canadian Energy Efficiency Alliance 

Ormund, Peter 

Mohawk College 

Pennstone, Mike 

Hydro One 

Pereira, Les 

Personal comment 

Phillips, Margie 

Pennsylvania Services Integration Consortium 

Rocha, Paul X. 

CenterPoint Energy 

Ross, Don 

Prince Edward Island Wind Co-Op 

Rupp, Douglas B 

Ada Core Technologies, Inc. 

Sasson, Mayer 

New York State Reliability Council 

Schwerdt, Ed 

Northeast Power Coordinating Council 

Seppa, Tapani O.

The Valley Group, Inc.

Silverstein, Alison 

Federal Energy Regulatory Commission 

Spears, J. 

Personal comment 

Spencer, Sidney 

Personal comment 

spider 

Personal comment 

Staniford, Stuart 

Personal comment 

Stephens, Eric B. 

Ohio Consumers’ Counsel (OCC) 

Stewart, Bob 

PG&E 

Synesiou, John 

IMS Corporation 

Tarler, Howard A. 

On behalf of Chairman William M. Flynn, New York State Department of Public Service 

Tatro, Phil 

National Grid Company 

Taylor, Carson 

Bonneville Power Administration 

van Welie, Gordon 

ISO New England Inc. 

Van Zandt, Vickie 

Bonneville Power Administration 

Warren, Kim 

IMO Ontario 

Watkins, Don 

Bonneville Power Administration 

Wells, Chuck 

OSISoft 

Wiedman, Tom 

ConEd 

Wightman, Donald 

Utility Workers Union of America 

Wilson, John 

Personal comment 

Winter, Chris 

Conservation Council of Ontario 

Wright, C. Dortch 

On behalf of New Jersey Governor James E. McGreevey 

Zwergel, Dave 

Midwest ISO

Appendix D


NERC Actions to Prevent and Mitigate the Impacts 
of Future Cascading Blackouts 

Preamble 

The Board of Trustees recognizes the paramount importance of a reliable bulk electric system in 
North America. In consideration of the findings of the investigation into the August 14, 2003 
blackout, NERC must take firm and immediate actions to increase public confidence that the 
reliability of the North American bulk electric system is being protected. 

A key finding of the blackout investigators is that violations of existing NERC reliability standards 
contributed directly to the blackout. Pending enactment of federal reliability legislation creating a 
framework for enforcement of mandatory reliability standards, and with the encouragement of the 
Stakeholders Committee, the board is determined to obtain full compliance with all existing and 
future reliability standards and intends to use all legitimate means available to achieve that end. The 
board therefore resolves to: 

• Receive specific information on all violations of NERC standards, including the identities of 
the parties involved; 

• Take firm actions to improve compliance with NERC reliability standards; 

• Provide greater transparency to violations of standards, while respecting the confidential 
nature of some information and the need for a fair and deliberate due process; and 

• Inform and work closely with the Federal Energy Regulatory Commission and other 
applicable federal, state, and provincial regulatory authorities in the United States, Canada, 
and Mexico as needed to ensure public interests are met with respect to compliance with 
reliability standards. 

The board expresses its appreciation to the blackout investigators and the Steering Group for their 
objective and thorough work in preparing a report of recommended NERC actions. With a few 
clarifications, the board approves the report and directs implementation of the recommended actions. 
The board holds the assigned committees and organizations accountable to report to the board the 
progress in completing the recommended actions, and intends itself to publicly report those results. 
The board recognizes the possibility that this action plan may have to be adapted as additional 
analysis is completed, but stresses the need to move forward immediately with the actions as stated. 

Furthermore, the board directs management to immediately advise the board of any significant 
violations of NERC reliability standards, including details regarding the nature and potential 
reliability impacts of the alleged violations and the identity of parties involved. Management shall 
supply to the board in advance of board meetings a detailed report of all violations of reliability 
standards. 

Finally, the board resolves to form a taskforce to develop guidelines for the board to consider with 
regard to the confidentiality of compliance information and disclosure of such information to 
regulatory authorities and the public. 


Approved by the Board of Trustees 1 

February 10, 2004

Overview of Investigation Conclusions


The North American Electric Reliability Council (NERC) has conducted a comprehensive 
investigation of the August 14, 2003 blackout. The results of NERC’s investigation contributed 
significantly to the U.S./Canada Power System Outage Task Force’s November 19, 2003 Interim 
Report identifying the root causes of the outage and the sequence of events leading to and during the 
cascading failure. NERC fully concurs with the conclusions of the Interim Report and continues to 
provide its support to the Task Force through ongoing technical analysis of the outage. Although an 
understanding of what happened and why has been resolved for most aspects of the outage, detailed 
analysis continues in several areas, notably dynamic simulations of the transient phases of the 
cascade and a final verification of the full scope of all violations of NERC and regional reliability 
standards that occurred leading to the outage. 


From its investigation of the August 14 blackout, NERC concludes that: 

• Several entities violated NERC operating policies and planning standards, and those 
violations contributed directly to the start of the cascading blackout. 

• The existing process for monitoring and assuring compliance with NERC and regional 
reliability standards was shown to be inadequate to identify and resolve specific compliance 
violations before those violations led to a cascading blackout. 

• Reliability coordinators and control areas have adopted differing interpretations of the 
functions, responsibilities, authorities, and capabilities needed to operate a reliable power 
system. 

• Problems identified in studies of prior large-scale blackouts were repeated, including 
deficiencies in vegetation management, operator training, and tools to help operators better 
visualize system conditions. 

• In some regions, data used to model loads and generators were inaccurate due to a lack of 
verification through benchmarking with actual system data and field testing. 

• Planning studies, design assumptions, and facilities ratings were not consistently shared and 
were not subject to adequate peer review among operating entities and regions. 

• Available system protection technologies were not consistently applied to optimize the ability 
to slow or stop an uncontrolled cascading failure of the power system. 



Overview of Recommendations 


The Board of Trustees approves the NERC Steering Group recommendations to address these 
shortcomings. The recommendations fall into three categories. 

Actions to Remedy Specific Deficiencies: Specific actions directed to First Energy (FE), the 
Midwest Independent System Operator (MISO), and the PJM Interconnection, LLC (PJM) to correct 
the deficiencies that led to the blackout. 

1. Correct the Direct Causes of the August 14, 2003 Blackout. 

Strategic Initiatives: Strategic initiatives by NERC and the regional reliability councils to strengthen 
compliance with existing standards and to formally track completion of recommended actions from 
August 14, and other significant power system events. 

2. Strengthen the NERC Compliance Enforcement Program. 

3. Initiate Control Area and Reliability Coordinator Reliability Readiness Audits. 

4. Evaluate Vegetation Management Procedures and Results. 

5. Establish a Program to Track Implementation of Recommendations. 

Technical Initiatives: Technical initiatives to prevent or mitigate the impacts of future cascading 
blackouts. 

6. Improve Operator and Reliability Coordinator Training.

7. Evaluate Reactive Power and Voltage Control Practices. 

8. Improve System Protection to Slow or Limit the Spread of Future Cascading Outages. 

9. Clarify Reliability Coordinator and Control Area Functions, Responsibilities, Capabilities 
and Authorities. 

10. Establish Guidelines for Real-Time Operating Tools. 

11. Evaluate Lessons Learned During System Restoration. 

12. Install Additional Time-Synchronized Recording Devices as Needed. 

13. Reevaluate System Design, Planning and Operating Criteria. 

14. Improve System Modeling Data and Data Exchange Practices. 


Market Impacts 

Many of the recommendations in this report have implications for electricity markets and market 
participants, particularly those requiring reevaluation or clarification of NERC and regional 
standards, policies and criteria. Implicit in these recommendations is that the NERC board charges 
the Market Committee with assisting in the implementation of the recommendations and interfacing 
with the North American Energy Standards Board with respect to any necessary business practices. 

Recommendation to Remedy Specific Deficiencies


Recommendation 1. Correct the Direct Causes of the August 14, 2003 Blackout.

NERC’s technical analysis of the August 14 blackout leads it to fully concur with the Task Force 
Interim Report regarding the direct causes of the blackout. The report stated that the principal causes 
of the blackout were that FE did not maintain situational awareness of conditions on its power system 
and did not adequately manage tree growth in its transmission rights-of-way. Contributing factors 
included ineffective diagnostic support provided by MISO as the reliability coordinator for FE and 
ineffective communications between MISO and PJM. 

NERC will take immediate and firm actions to ensure that the same deficiencies that were directly 
causal to the August 14 blackout are corrected. These steps are necessary to assure electricity 
customers, regulators and others with an interest in the reliable delivery of electricity that the power 
system is being operated in a manner that is safe and reliable, and that the specific causes of the 
August 14 blackout have been identified and fixed. 


Recommendation 1a: FE, MISO, and PJM shall each complete the remedial actions designated
in Attachment A for their respective organizations and certify to the NERC board no later than 
June 30, 2004, that these specified actions have been completed. Furthermore, each 
organization shall present its detailed plan for completing these actions to the NERC 
committees for technical review on March 23-24, 2004, and to the NERC board for approval no 
later than April 2, 2004. 


Recommendation 1b: The NERC Technical Steering Committee shall immediately assign a
team of experts to assist FE, MISO, and PJM in developing plans that adequately address the 
issues listed in Attachment A, and other remedial actions for which each entity may seek 
technical assistance. 


Strategic Initiatives to

Assure Compliance with Reliability Standards and to Track Recommendations 


Recommendation 2. Strengthen the NERC Compliance Enforcement Program. 

NERC’s analysis of the actions and events leading to the 
August 14 blackout leads it to conclude that several 
violations of NERC operating policies contributed directly 
to an uncontrolled, cascading outage on the Eastern 
Interconnection. NERC continues to investigate additional 
violations of NERC and regional reliability standards and 
expects to issue a final report of those violations in March 
2004. 

In the absence of enabling legislation in the United States 
and complementary actions in Canada and Mexico to 
authorize the creation of an electric reliability organization,
NERC lacks legally sanctioned authority to enforce
compliance with its reliability rules. However, the August 
14 blackout is a clear signal that voluntary compliance with 
reliability rules is no longer adequate. NERC and the 
regional reliability councils must assume firm authority to 
measure compliance, to more transparently report 
significant violations that could risk the integrity of the 
interconnected power system, and to take immediate and 
effective actions to ensure that such violations are corrected. 


Violations of NERC standards identified in 
the November 19, 2003 Interim Report: 

1. Following the outage of the Chamberlin- 
Harding 345 kV line, FE did not take the 
necessary actions to return the system to 
a safe operating state within 30 minutes 
(violation of NERC Operating Policy 2). 

2. FE did not notify other systems of an 
impending system emergency (violation 
of NERC Operating Policy 5). 

3. FE’s analysis tools were not used to 
effectively assess system conditions 
(violation of NERC Operating Policy 5). 

4. FE operator training was inadequate for 
maintaining reliable conditions (violation 
of NERC Operating Policy 8). 

5. MISO did not notify other reliability 
coordinators of potential problems 
(violation of NERC Operating Policy 9). 


Recommendation 2a: Each regional reliability council shall report to the NERC Compliance 
Enforcement Program within one month of occurrence all significant 1 violations of NERC 
operating policies and planning standards and regional standards, whether verified or still 
under investigation. Such reports shall confidentially note details regarding the nature and 
potential reliability impacts of the alleged violations and the identity of parties involved. 
Additionally, each regional reliability council shall report quarterly to NERC, in a format 
prescribed by NERC, all violations of NERC and regional reliability council standards. 


Recommendation 2b: When presented with the results of the investigation of any significant 
violation, and with due consideration of the surrounding facts and circumstances, the NERC 
board shall require an offending organization to correct the violation within a specified time. If 
the board determines that an offending organization is non-responsive and continues to cause a 
risk to the reliability of the interconnected power systems, the board will seek to remedy the 
violation by requesting assistance of the appropriate regulatory authorities in the United States, 
Canada, and Mexico. 


1 Although all violations are important, a significant violation is one that could directly reduce the integrity of the 
interconnected power systems or otherwise cause unfavorable risk to the interconnected power systems. By contrast, a 
violation of a reporting or administrative requirement would not by itself generally be considered a significant violation. 
Recommendation 2c: The Planning and Operating Committees, working in conjunction with 
the Compliance Enforcement Program, shall review and update existing approved and draft 
compliance templates applicable to current NERC operating policies and planning standards; 
and submit any revisions or new templates to the board for approval no later than March 31, 
2004. To expedite this task, the NERC President shall immediately form a Compliance 
Template Task Force comprised of representatives of each committee. The Compliance 
Enforcement Program shall issue the board-approved compliance templates to the regional 
reliability councils for adoption into their compliance monitoring programs. 


This effort will make maximum use of existing approved and draft compliance templates in order to 
meet the aggressive schedule. The templates are intended to include all existing NERC operating 
policies and planning standards but can be adapted going forward to incorporate new reliability 
standards as they are adopted by the NERC board for implementation in the future. 

When the investigation team’s final report on the August 14 violations of NERC and regional 
standards is available in March, it will be important to assess and understand the lapses that allowed 
violations to go unreported until a large-scale blackout occurred. 


Recommendation 2d: The NERC Compliance Enforcement Program and ECAR shall, within 
three months of the issuance of the final report from the Compliance and Standards 
investigation team, evaluate the identified violations of NERC and regional standards, as 
compared to previous compliance reviews and audits for the applicable entities, and develop 
recommendations to improve the compliance process. 


Recommendation 3. Initiate Control Area and Reliability Coordinator Reliability Readiness 
Audits. 

In conducting its investigation, NERC found that deficiencies in control area and reliability 
coordinator capabilities to perform assigned reliability functions contributed to the August 14 
blackout. In addition to specific violations of NERC and regional standards, some reliability 
coordinators and control areas were deficient in the performance of their reliability functions and did 
not achieve a level of performance that would be considered acceptable practice in areas such as 
operating tools, communications, and training. In a number of cases there was a lack of clarity in the 
NERC policies with regard to what is expected of a reliability coordinator or control area. Although 
the deficiencies in the NERC policies must be addressed (see Recommendation 9), it is equally 
important to recognize that standards cannot prescribe all aspects of reliable operation and that 
minimum standards present a threshold, not a target for performance. Reliability coordinators and 
control areas must perform well, particularly under emergency conditions, and at all times strive for 
excellence in their assigned reliability functions and responsibilities. 


Recommendation 3a: The NERC Compliance Enforcement Program and the regional 
reliability councils shall jointly establish a program to audit the reliability readiness of all 
reliability coordinators and control areas, with immediate attention given to addressing the 
deficiencies identified in the August 14 blackout investigation. Audits of all control areas and 
reliability coordinators shall be completed within three years and continue in a three-year 
cycle. The 20 highest priority audits, as determined by the Compliance Enforcement Program, 
will be completed by June 30, 2004. 


Recommendation 3b: NERC will establish a set of baseline audit criteria to which regional 
criteria may be added. The control area requirements will be based on the existing NERC 
Control Area Certification Procedure. Reliability coordinator audits will include evaluation of 
reliability plans, procedures, processes, tools, personnel qualifications, and training. In 
addition to reviewing written documents, the audits will carefully examine the actual practices 
and preparedness of control areas and reliability coordinators. 


Recommendation 3c: The reliability regions, with the oversight and direct participation of 
NERC, will audit each control area’s and reliability coordinator’s readiness to meet these audit 
criteria. FERC and other relevant regulatory agencies will be invited to participate in the 
audits, subject to the same confidentiality conditions as the other members of the audit teams. 


Recommendation 4. Evaluate Vegetation Management Procedures and Results. 

Ineffective vegetation management was a major cause of the August 14 blackout and also contributed 
to other historical large-scale blackouts, such as those of July 2-3, 1996 in the West. Maintaining 
transmission line rights-of-way (ROW), including maintaining safe clearances of energized lines 
from vegetation, under-build, and other obstructions 2 incurs a substantial ongoing cost in many areas 
of North America. However, it is an important investment for assuring a reliable electric system. 

NERC does not presently have standards for ROW maintenance. Standards on vegetation 
management are particularly challenging given the great diversity of vegetation and growth patterns 
across North America. However, NERC’s standards do require that line ratings are calculated so as 
to maintain safe clearances from all obstructions. Furthermore, in the United States, the National 
Electrical Safety Code (NESC) Rules 232, 233, and 234 detail the minimum vertical and horizontal 
safety clearances of overhead conductors from grounded objects and various types of obstructions. 
NESC Rule 218 addresses tree clearances by simply stating, “Trees that may interfere with 
ungrounded supply conductors should be trimmed or removed.” Several states have adopted their 
own electrical safety codes and similar codes apply in Canada. 

Recognizing that ROW maintenance requirements vary substantially depending on local conditions, 
NERC will focus attention initially on measuring performance as indicated by the number of high 
voltage line trips caused by vegetation rather than immediately move toward developing standards for 


2 Vegetation, such as the trees that caused the initial line trips in FE that led to the August 14, 2003 outage, is not the only 
type of obstruction that can breach the safe clearance distances from energized lines. Other examples include under-build 
of telephone and cable TV lines, train crossings, and even nests of certain large bird species. 
ROW maintenance. This approach has worked well in the Western Electricity Coordinating Council 
(WECC) since being instituted after the 1996 outages. 


Recommendation 4a: NERC and the regional reliability councils shall jointly initiate a program 
to report all bulk electric system 3 transmission line trips resulting from vegetation contact 4 . 
The program will use the successful WECC vegetation monitoring program as a model. 


Recommendation 4b: Beginning with an effective date of January 1, 2004, each transmission 
operator will submit an annual report of all vegetation-related high voltage line trips to its 
respective reliability region. Each region shall assemble a detailed annual report of vegetation- 
related line trips in the region to NERC no later than March 31 for the preceding year, with the 
first reporting to be completed by March 2005 for calendar year 2004. 


Vegetation management practices, including inspection and trimming requirements, can vary 
significantly with geography. Additionally, some entities use advanced techniques such as planting 
beneficial species or applying growth retardants. Nonetheless, the events of August 14 and prior 
outages point to the need for independent verification that viable programs exist for ROW 
maintenance and that the programs are being followed. 


Recommendation 4c: Each bulk electric transmission owner shall make its vegetation 
management procedure, and documentation of work completed, available for review and 
verification upon request by the applicable regional reliability council, NERC, or applicable 
federal, state or provincial regulatory agency. 


Should this approach of monitoring vegetation-related line outages and procedures prove ineffective 
in reducing the number of vegetation-related line outages, NERC will consider the development of 
minimum line clearance standards to assure reliability. 


Recommendation 5. Establish a Program to Track Implementation of Recommendations. 

The August 14 blackout shared a number of contributing factors with prior large-scale blackouts, 
including: 

• Conductors contacting trees 

• Ineffective visualization of power system conditions and lack of situational awareness 

• Ineffective communications 

• Lack of training in recognizing and responding to emergencies 

• Insufficient static and dynamic reactive power supply 

• Need to improve relay protection schemes and coordination 


3 All transmission lines operating at 230 kV and higher voltage, and any other lower voltage lines designated by the 
regional reliability council to be critical to the reliability of the bulk electric system, shall be included in the program. 

4 A line trip includes a momentary opening and reclosing of the line, a lock out, or a combination. For reporting 
purposes, all vegetation-related openings of a line occurring within one 24-hour period should be considered one event. 
Trips known to be caused by severe weather or other natural disaster such as earthquake are excluded. Contact with 
vegetation includes both physical contact and arcing due to insufficient clearance. 
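
The counting rule in footnote 4 can be illustrated with a short sketch. It is not part of the NERC or WECC program: the record layout, the cause labels, the function name `count_vegetation_events`, and the choice to start each 24-hour window at the first opening in a group are all assumptions made here for illustration.

```python
from datetime import datetime, timedelta


def count_vegetation_events(trips, window=timedelta(hours=24)):
    """Count reportable vegetation events per line.

    trips: iterable of (line_id, timestamp, cause) tuples (hypothetical layout).
    Only records whose cause is "vegetation" are counted; trips attributed to
    severe weather or another natural disaster are skipped, as in footnote 4.
    Openings of the same line that fall within one window count as one event.
    """
    by_line = {}
    for line_id, timestamp, cause in trips:
        if cause != "vegetation":
            continue
        by_line.setdefault(line_id, []).append(timestamp)

    counts = {}
    for line_id, stamps in by_line.items():
        stamps.sort()
        events = 0
        window_start = None
        for ts in stamps:
            # Assumption: a new event begins once 24 hours have elapsed
            # since the opening that started the current event.
            if window_start is None or ts - window_start >= window:
                events += 1
                window_start = ts
        counts[line_id] = events
    return counts
```

Under these assumptions, a momentary opening followed by a lockout a few minutes later on the same line is reported as a single event, while a further vegetation-related opening more than 24 hours after the first is reported as a second event.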
It is important that recommendations resulting from system outages be adopted consistently by all 
regions and operating entities, not just those directly affected by a particular outage. Several lessons 
learned prior to August 14, if heeded, could have prevented the outage. WECC and NPCC, for 
example, have programs that could be used as models for tracking completion of recommendations. 
NERC and some regions have not adequately tracked completion of recommendations from prior 
events to ensure they were consistently implemented. 


Recommendation 5a: NERC and each regional reliability council shall establish a program for 
documenting completion of recommendations resulting from the August 14 blackout and other 
historical outages, as well as NERC and regional reports on violations of reliability standards, results 
of compliance audits, and lessons learned from system disturbances. Regions shall report quarterly to 
NERC on the status of follow-up actions to address recommendations, lessons learned, and areas 
noted for improvement. NERC staff shall report both NERC activities and a summary of regional 
activities to the board. 


Assuring compliance with reliability standards, evaluating the reliability readiness of reliability 
coordinators and control areas, and assuring recommended actions are achieved will be effective 
steps in reducing the chances of future large-scale outages. However, it is important for NERC to 
also adopt a process for continuous learning and improvement by seeking continuous feedback on 
reliability performance trends, rather than relying mainly on learning from and reacting to catastrophic failures. 


Recommendation 5b: NERC shall by January 1, 2005 establish a reliability performance 
monitoring function to evaluate and report bulk electric system reliability performance. 


Such a function would assess large-scale outages and near misses to determine root causes and 
lessons learned, similar to the August 14 blackout investigation. This function would incorporate the 
current Disturbance Analysis Working Group and expand that work to provide more proactive 
feedback to the NERC board regarding reliability performance. This program would also gather and 
analyze reliability performance statistics to inform the board of reliability trends. This function could 
develop procedures and capabilities to initiate investigations in the event of future large-scale outages 
or disturbances. Such procedures and capabilities would be shared between NERC and the regional 
reliability councils for use as needed, with NERC and regional investigation roles clearly defined in 
advance. 
Technical Initiatives to Minimize the Likelihood 
and Impacts of Possible Future Cascading Outages 


Recommendation 6. Improve Operator and Reliability Coordinator Training. 

NERC found during its investigation that some reliability coordinators and control area operators had 
not received adequate training in recognizing and responding to system emergencies. Most notable 
was the lack of realistic simulations and drills for training and verifying the capabilities of operating 
personnel. This training deficiency contributed to the lack of situational awareness and failure to 
declare an emergency when operator intervention was still possible prior to the high speed portion of 
the sequence of events. 


Recommendation 6: All reliability coordinators, control areas, and transmission operators shall 
provide at least five days per year of training and drills in system emergencies, using realistic 
simulations 5 , for each staff person with responsibility for the real-time operation or reliability 
monitoring of the bulk electric system. This system emergency training is in addition to other 
training requirements. Five days of system emergency training and drills are to be completed 
prior to June 30, 2004, with credit given for documented training already completed since July 
1, 2003. Training documents, including curriculum, training methods, and individual training 
records, are to be available for verification during reliability readiness audits. 


NERC has published Continuing Education Criteria specifying appropriate qualifications for 
continuing education providers and training activities. 

In the longer term, the NERC Personnel Certification Governance Committee (PCGC), which is 
independent of the NERC board, should explore expanding the certification requirements of system 
operating personnel to include additional measures of competency in recognizing and responding to 
system emergencies. The current NERC certification examination is a written test of the NERC 
Operating Manual and other references relating to operator job duties, and is not by itself intended to 
be a complete demonstration of competency to handle system emergencies. 


Recommendation 7. Evaluate Reactive Power and Voltage Control Practices. 

The August 14 blackout investigation identified inconsistent practices in northeastern Ohio with 
regard to the setting and coordination of voltage limits and insufficient reactive power supply. 
Although the deficiency of reactive power supply in northeastern Ohio did not directly cause the 
blackout, it was a contributing factor and was a significant violation of existing reliability standards. 

In particular, there appear to have been violations of NERC Planning Standard I.D.S1 requiring static 
and dynamic reactive power resources to meet the performance criteria specified in Table I of 


5 The term “realistic simulations” includes a variety of tools and methods that present operating personnel with situations 
to improve and test diagnostic and decision-making skills in an environment that resembles expected conditions during a 
particular type of system emergency. Although a full replica training simulator is one approach, lower cost alternatives 
such as PC-based simulators, tabletop drills, and simulated communications can be effective training aids if used 
properly. 
Planning Standard I.A on Transmission Systems. Planning Standard II.B.S1 requires each regional 
reliability council to establish procedures for generating equipment data verification and testing, 
including reactive power capability. Planning Standard III.C.S1 requires that all synchronous 
generators connected to the interconnected transmission systems shall be operated with their 
excitation system in the automatic voltage control mode unless approved otherwise by the 
transmission system operator. S2 of this standard also requires that generators shall maintain a 
network voltage or reactive power output as required by the transmission system operator within the 
reactive capability of the units. 

On one hand, the unsafe conditions on August 14 with respect to voltage in northeastern Ohio can be 
said to have resulted from violations of NERC planning criteria for reactive power and voltage 
control, and those violations should have been identified through the NERC and ECAR compliance 
monitoring programs (addressed by Recommendation 2). On the other hand, investigators believe 
these deficiencies are also symptomatic of a systematic breakdown of the reliability studies and 
practices in FE and the ECAR region that allowed unsafe voltage criteria to be set and used in study 
models and operations. There were also issues identified with reactive characteristics of loads, as 
addressed in Recommendation 14. 


Recommendation 7a: The Planning Committee shall reevaluate within one year the 
effectiveness of the existing reactive power and voltage control standards and how they are 
being implemented in practice in the ten NERC regions. Based on this evaluation, the Planning 
Committee shall recommend revisions to standards or process improvements to ensure voltage 
control and stability issues are adequately addressed. 


Recommendation 7b: ECAR shall no later than June 30, 2004 review its reactive power and 
voltage criteria and procedures, verify that its criteria and procedures are being fully 
implemented in regional and member studies and operations, and report the results to the 
NERC board. 


Recommendation 8. Improve System Protection to Slow or Limit the Spread of Future 
Cascading Outages. 

The importance of automatic control and protection systems in preventing, slowing, or mitigating the 
impact of a large-scale outage cannot be stressed enough. To underscore this point, following the trip 
of the Sammis-Star line at 4:06, the cascading failure into parts of eight states and two provinces, 
including the trip of over 531 generating units and over 400 transmission lines, was completed in the 
next eight minutes. Most of the event sequence, in fact, occurred in the final 12 seconds of the 
cascade. Likewise, the July 2, 1996 failure took less than 30 seconds and the August 10, 1996 failure 
took only 5 minutes. It is not practical to expect operators will always be able to analyze a massive, 
complex system failure and to take the appropriate corrective actions in a matter of a few minutes. 
The NERC investigators believe that two measures would have been crucial in slowing or stopping 
the uncontrolled cascade on August 14: 

• Better application of zone 3 impedance relays on high voltage transmission lines 

• Selective use of under-voltage load shedding. 


First, beginning with the Sammis-Star line trip, most of the remaining line trips during the cascade 
phase were the result of the operation of a zone 3 relay for a perceived overload (a combination of 
high amperes and low voltage) on the protected line. If used, zone 3 relays typically act as an 
overreaching backup to the zone 1 and 2 relays, and are not intentionally set to operate on a line 
overload. However, under extreme conditions of low voltages and large power swings as seen on 
August 14, zone 3 relays can operate for overload conditions and propagate the outage to a wider area 
by essentially causing the system to “break up”. Many of the zone 3 relays that operated during the 
August 14 cascading outage were not set with adequate margins above their emergency thermal 
ratings. For the short times involved, thermal heating is not a problem and the lines should not be 
tripped for overloads. Instead, power system protection devices should be set to address the specific 
condition of concern, such as a fault, out-of-step condition, etc., and should not compromise a power 
system’s inherent physical capability to slow down or stop a cascading event. 


Recommendation 8a: All transmission owners shall, no later than September 30, 2004, evaluate 
the zone 3 relay settings on all transmission lines operating at 230 kV and above for the 
purpose of verifying that each zone 3 relay is not set to trip on load under extreme emergency 
conditions 6 . In each case that a zone 3 relay is set so as to trip on load under extreme 
conditions, the transmission operator shall reset, upgrade, replace, or otherwise mitigate the 
overreach of those relays as soon as possible and on a priority basis, but no later than 
December 31, 2005. Upon completing analysis of its application of zone 3 relays, each 
transmission owner may no later than December 31, 2004 submit justification to NERC for 
applying zone 3 relays outside of these recommended parameters. The Planning Committee 
shall review such exceptions to ensure they do not increase the risk of widening a cascading 
failure of the power system. 
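
The loadability check described in Recommendation 8a and its footnote (no operation at or below 150% of the emergency ampere rating, at 0.85 per unit voltage and a 30 degree line phase angle) can be sketched numerically. The sketch below is an illustration under stated assumptions, not the investigation team's method: it assumes a mho-type distance element whose circle passes through the origin, primary-side ohms, and hypothetical function and parameter names.

```python
import math


def load_impedance_ohms(v_ll_kv, i_emergency_a, v_pu=0.85, load_multiple=1.5):
    """Apparent phase impedance (primary ohms) seen by the relay when the line
    carries load_multiple times its emergency rating at v_pu voltage."""
    v_ln_volts = v_pu * v_ll_kv * 1000.0 / math.sqrt(3.0)
    return v_ln_volts / (load_multiple * i_emergency_a)


def mho_reach_at_angle(z_reach_ohms, mta_deg, angle_deg):
    """Reach of a mho circle through the origin, measured along angle_deg.
    z_reach_ohms is the reach at the maximum torque angle mta_deg (assumed model)."""
    return z_reach_ohms * math.cos(math.radians(mta_deg - angle_deg))


def trips_on_load(v_ll_kv, i_emergency_a, z_reach_ohms, mta_deg, load_angle_deg=30.0):
    """True if the extreme-load impedance point falls on or inside the mho circle,
    i.e. the zone 3 element would pick up on load rather than on a fault."""
    z_load = load_impedance_ohms(v_ll_kv, i_emergency_a)
    return z_load <= mho_reach_at_angle(z_reach_ohms, mta_deg, load_angle_deg)
```

For a hypothetical 345 kV line with a 2,000 A emergency rating, the load point sits at roughly 56.4 primary ohms. A zone 3 element set to 100 ohms at a 75 degree maximum torque angle reaches about 70.7 ohms along the 30 degree load angle and would therefore trip on load, while a 60 ohm setting (about 42.4 ohms along that angle) would not.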


A second key finding with regard to system protection was that if an automatic under-voltage load 
shedding scheme had been in place in the Cleveland-Akron area on August 14, there is a high 
probability the outage could have been limited to that area. 


Recommendation 8b: Each regional reliability council shall complete an evaluation of the 
feasibility and benefits of installing under-voltage load shedding capability in load centers 
within the region that could become unstable as a result of being deficient in reactive power 
following credible multiple-contingency events. The regions are to complete the initial studies 
and report the results to NERC within one year. The regions are requested to promote the 
installation of under-voltage load shedding capabilities within critical areas, as determined by 
the studies to be effective in preventing an uncontrolled cascade of the power system. 


The NERC investigation of the August 14 blackout has identified additional transmission and 
generation control and protection issues requiring further analysis. One concern is that generating 
unit control and protection schemes need to consider the full range of possible extreme system 
conditions, such as the low voltages and low and high frequencies experienced on August 14. The 
team also noted that improvements may be needed in under-frequency load shedding and its 
coordination with generator under-and over-frequency protection and controls. 


6 The NERC investigation team recommends that the zone 3 relay, if used, should not operate at or below 150% of the 
emergency ampere rating of a line, assuming a 0.85 per unit voltage and a line phase angle of 30 degrees. 
Recommendation 8c: The Planning Committee shall evaluate Planning Standard III - System 
Protection and Control and propose within one year specific revisions to the criteria to 
adequately address the issue of slowing or limiting the propagation of a cascading failure. The 
board directs the Planning Committee to evaluate the lessons from August 14 regarding relay 
protection design and application and offer additional recommendations for improvement. 


Recommendation 9. Clarify Reliability Coordinator and Control Area Functions, 
Responsibilities, Capabilities and Authorities. 

Ambiguities in the NERC operating policies may have allowed entities involved in the August 14 
blackout to make different interpretations regarding the functions, responsibilities, capabilities, and 
authorities of reliability coordinators and control areas. Characteristics and capabilities necessary to 
enable prompt recognition and effective response to system emergencies must be specified. 

The lack of timely and accurate outage information resulted in degraded performance of state 
estimator and reliability assessment functions on August 14. There is a need to review options for 
sharing of outage information in the operating time horizon (e.g. 15 minutes or less), so as to ensure 
the accurate and timely sharing of outage data necessary to support real-time operating tools such as 
state estimators, real-time contingency analysis, and other system monitoring tools. 

On August 14, reliability coordinator and control area communications regarding conditions in 
northeastern Ohio were ineffective, and in some cases confusing. Ineffective communications 
contributed to a lack of situational awareness and precluded effective actions to prevent the cascade. 
Consistent application of effective communications protocols, particularly during emergencies, is 
essential to reliability. Alternatives should be considered to one-on-one phone calls during an 
emergency to ensure all parties are getting timely and accurate information with a minimum number 
of calls. 

NERC operating policies do not adequately specify critical facilities, leaving ambiguity regarding 
which facilities must be monitored by reliability coordinators. Nor do the policies adequately define 
criteria for declaring transmission system emergencies. Operating policies should also clearly specify 
that curtailing interchange transactions through the NERC Transmission Loading Relief (TLR) 
Procedure is not intended as a method for restoring the system from an actual Operating Security 
Limit violation to a secure operating state. 

Recommendation 9: The Operating Committee shall complete the following by June 30, 2004: 

• Evaluate and revise the operating policies and procedures, or provide interpretations, 
to ensure reliability coordinator and control area functions, responsibilities, and 
authorities are completely and unambiguously defined. 

• Evaluate and improve the tools and procedures for operator and reliability 
coordinator communications during emergencies. 

• Evaluate and improve the tools and procedures for the timely exchange of outage 
information among control areas and reliability coordinators. 
Recommendation 10. Establish Guidelines for Real-Time Operating Tools. 

The August 14 blackout was caused by a lack of situational awareness that was in turn the result of 
inadequate reliability tools and backup capabilities. Additionally, the failure of FE’s control 
computers and alarm system contributed directly to the lack of situational awareness. Likewise, 
MISO’s incomplete tool set and the failure of its state estimator to work effectively on August 14 
contributed to the lack of situational awareness. 


Recommendation 10: The Operating Committee shall within one year evaluate the real-time 
operating tools necessary for reliable operation and reliability coordination, including backup 
capabilities. The Operating Committee is directed to report both minimum acceptable 
capabilities for critical reliability functions and a guide of best practices. 


This evaluation should include consideration of the following: 

• Modeling requirements, such as model size and fidelity, real and reactive load modeling, 
sensitivity analyses, accuracy analyses, validation, measurement, observability, update 
procedures, and procedures for the timely exchange of modeling data. 

• State estimation requirements, such as periodicity of execution, monitoring external facilities, 
solution quality, topology error and measurement error detection, failure rates including times 
between failures, presentation of solution results including alarms, and troubleshooting 
procedures. 

• Real-time contingency analysis requirements, such as contingency definition, periodicity of 
execution, monitoring external facilities, solution quality, post-contingency automatic actions, 
failure rates including mean/maximum times between failures, reporting of results, 
presentation of solution results including alarms, and troubleshooting procedures including 
procedures for investigating unsolvable contingencies. 


Recommendation 11. Evaluate Lessons Learned During System Restoration. 

The efforts to restore the power system and customer service following the outage were effective, 
considering the massive amount of load lost and the large number of generators and transmission 
lines that tripped. Fortunately, the restoration was aided by the ability to energize transmission from 
neighboring systems, thereby speeding the recovery. Despite the apparent success of the restoration 
effort, it is important to evaluate the results in more detail to determine opportunities for 
improvement. Blackstart and restoration plans are often developed through study of simulated 
conditions. Robust testing of live systems is difficult because of the risk of disturbing the system or 
interrupting customers. The August 14 blackout provides a valuable opportunity to apply actual 
events and experiences to learn to better prepare for system blackstart and restoration in the future. 
That opportunity should not be lost, despite the relative success of the restoration phase of the outage. 


Recommendation 11a: The Planning Committee, working in conjunction with the Operating 
Committee, NPCC, ECAR, and PJM, shall evaluate the blackstart and system restoration 
performance following the outage of August 14, and within one year report to the NERC board 
the results of that evaluation with recommendations for improvement. 


Recommendation 11b: All regional reliability councils shall, within six months of the Planning 
Committee report to the NERC board, reevaluate their procedures and plans to assure an 
effective blackstart and restoration capability within their region. 


Recommendation 12. Install Additional Time-Synchronized Recording Devices as Needed. 

A valuable lesson from the August 14 blackout is the importance of having time-synchronized system 
data recorders. NERC investigators labored over thousands of data items to synchronize the 
sequence of events, much like putting together small pieces of a very large puzzle. That process 
would have been significantly improved and sped up if there had been a sufficient number of 
synchronized data recording devices. 

NERC Planning Standard I.F - Disturbance Monitoring does require location of recording devices for 
disturbance analysis. Often, recorders are available, but they are not synchronized to a time 
standard. All digital fault recorders, digital event recorders, and power system disturbance recorders 
should be time stamped at the point of observation with a precise Global Positioning System (GPS) 
synchronizing signal. Recording and time-synchronization equipment should be monitored and 
calibrated to assure accuracy and reliability. 

Time-synchronized devices, such as phasor measurement units, can also be beneficial for monitoring 
a wide-area view of power system conditions in real time, as demonstrated in WECC with its 
Wide-Area Monitoring System (WAMS). 
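
The value of a common time reference can be illustrated with a minimal sketch. It assumes each recorder's clock offset from the reference is already known; the names `Event`, `align_events`, and `clock_offsets`, and the sample times, are hypothetical and are not taken from the investigation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Event:
    recorder: str         # hypothetical recorder identifier
    local_time: datetime  # time stamped by the recorder's own clock
    description: str


def align_events(events, clock_offsets):
    """Return (corrected_time, event) pairs sorted by corrected time.

    clock_offsets maps a recorder name to the timedelta by which its
    clock runs ahead of the reference (assumed known for this sketch).
    Recorders missing from the mapping are treated as synchronized.
    """
    corrected = [
        (e.local_time - clock_offsets.get(e.recorder, timedelta(0)), e)
        for e in events
    ]
    return sorted(corrected, key=lambda pair: pair[0])


# Illustrative data only: recorder A runs 2 seconds fast.
t0 = datetime(2000, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
events = [
    Event("A", t0 + timedelta(seconds=5), "breaker opened"),
    Event("B", t0 + timedelta(seconds=4), "voltage dip"),
]
ordered = align_events(events, {"A": timedelta(seconds=2)})
for when, event in ordered:
    print(when.isoformat(), event.recorder, event.description)
```

By the raw stamps, recorder B's event appears first; after correction, recorder A's event precedes it. With GPS-synchronized recorders the offsets are effectively zero and this correction step is unnecessary.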


Recommendation 12a: The reliability regions, coordinated through the NERC Planning 
Committee, shall within one year define regional criteria for the application of synchronized 
recording devices in power plants and substations. Regions are requested to facilitate the 
installation of an appropriate number, type and location of devices within the region as soon as 
practical to allow accurate recording of future system disturbances and to facilitate 
benchmarking of simulation studies by comparison to actual disturbances. 


Recommendation 12b: Facilities owners shall, in accordance with regional criteria, upgrade 
existing dynamic recorders to include GPS time synchronization and, as necessary, install 
additional dynamic recorders. 


Recommendation 13. Reevaluate System Design, Planning and Operating Criteria. 

The investigation report noted that FE entered the day on August 14 with insufficient resources to 
stay within operating limits following a credible set of contingencies, such as the loss of the East 
Lake 5 unit and the Chamberlin-Harding line. NERC will conduct an evaluation of operations 
planning practices and criteria to ensure expected practices are sufficient and well understood. The 
review will reexamine fundamental operating criteria, such as n-1 and the 30-minute limit in 
preparing the system for a next contingency, and Table I Category C.3 of the NERC planning 
standards. Operations planning and operating criteria will be identified that are sufficient to ensure 
the system is in a known and reliable condition at all times, and that positive controls, whether 
manual or automatic, are available and appropriately located at all times to return the Interconnection 
to a secure condition. Daily operations planning and subsequent real-time operations planning will 
identify available system reserves to meet operating criteria. 


Recommendation 13a: The Operating Committee shall evaluate operations planning and 
operating criteria and recommend revisions in a report to the board within one year. 


Prior studies in the ECAR region did not adequately define the system conditions that were observed 
on August 14. Severe contingency criteria were not adequate to address the events of August 14 that 
led to the uncontrolled cascade. Also, northeastern Ohio was found to have insufficient reactive 
support to serve its loads and meet import criteria. Instances were also noted in the FE system and 
ECAR area of different ratings being used for the same facility by planners and operators and among 
entities, making the models used for system planning and operation suspect. NERC and the regional 
reliability councils must take steps to assure facility ratings are being determined using consistent 
criteria and being effectively shared and reviewed among entities and among planners and operators. 


Recommendation 13b: ECAR shall no later than June 30, 2004 reevaluate its planning and 
study procedures and practices to ensure they are in compliance with NERC standards, ECAR 
Document No. 1, and other relevant criteria; and that ECAR and its members’ studies are 
being implemented as required. 


Recommendation 13c: The Planning Committee, working in conjunction with the regional 
reliability councils, shall within two years reevaluate the criteria, methods and practices used 
for system design, planning and analysis; and shall report the results and recommendations to 
the NERC board. This review shall include an evaluation of transmission facility ratings 
methods and practices, and the sharing of consistent ratings information. 


Regional reliability councils may consider assembling a regional database that includes the ratings of 
all bulk electric system (100 kV and higher voltage) transmission lines, transformers, phase angle 
regulators, and phase shifters. This database should be shared with neighboring regions as needed for 
system planning and analysis. 

NERC and the regional reliability councils should review the scope, frequency, and coordination of 
interregional studies, to include the possible need for simultaneous transfer studies. Study criteria 
will be reviewed, particularly the maximum credible contingency criteria used for system analysis. 
Each control area will be required to identify, for both the planning and operating time horizons, the 
planned emergency import capabilities for each major load area. 


Recommendation 14. Improve System Modeling Data and Data Exchange Practices. 

The after-the-fact models developed to simulate August 14 conditions and events indicate that 
dynamic modeling assumptions, including generator and load power factors, used in planning and 
operating models were inaccurate. Of particular note, the assumptions of load power factor were 
overly optimistic (loads were absorbing much more reactive power than pre-August 14 models 
indicated). Another suspected problem is modeling of shunt capacitors under depressed voltage 
conditions. Regional reliability councils should establish regional power system models that enable 
the sharing of consistent, validated data among entities in the region. Power flow and transient 
stability simulations should be periodically compared (benchmarked) with actual system events to 
validate model data. Viable load (including load power factor) and generator testing programs are 
necessary to improve agreement between power flows and dynamic simulations and the actual system 
performance. 


Recommendation 14: The regional reliability councils shall within one year establish and begin 
implementing criteria and procedures for validating data used in power flow models and 
dynamic simulations by benchmarking model data with actual system performance. Validated 
modeling data shall be exchanged on an inter-regional basis as needed for reliable system 
planning and operation. 
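
As a rough illustration of benchmarking, the sketch below compares simulated quantities against measured ones and flags those outside a tolerance. The function name `benchmark`, the 5% tolerance, and the sample values are assumptions made for illustration; they are not criteria from NERC or the regional councils.

```python
def benchmark(simulated, measured, tolerance=0.05):
    """Return {name: relative_error} for quantities whose simulated value
    deviates from the measured value by more than `tolerance`.

    Quantities missing from the simulation, or measured as zero, are
    skipped because a relative error cannot be computed for them.
    """
    flagged = {}
    for name, meas in measured.items():
        if name not in simulated or meas == 0:
            continue
        rel_err = abs(simulated[name] - meas) / abs(meas)
        if rel_err > tolerance:
            flagged[name] = rel_err
    return flagged


# Illustrative values only: the model understates reactive load at one bus.
simulated = {"bus1_mw": 100.0, "bus1_mvar": 80.0}
measured = {"bus1_mw": 101.0, "bus1_mvar": 100.0}
flagged = benchmark(simulated, measured)
print(flagged)
```

Here the real power agrees to within about 1%, while the reactive power is off by 20% and is flagged, which is the kind of load power factor discrepancy described above.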


During the data collection phase of the blackout investigation, when control areas were asked for 
information pertaining to merchant generation within their area, data was frequently not supplied. 
The reason often given was that the control area did not know the status or output of the generator at 
a given point in time. Another reason was the commercial sensitivity or confidentiality of such data. 


Appendix E 

List of Electricity Acronyms 


AEP          American Electric Power 
BPA          Bonneville Power Administration 
CA           Control area 
CNSC         Canadian Nuclear Safety Commission 
DOE          Department of Energy (U.S.) 
ECAR         East Central Area Reliability Coordination Agreement 
EIA          Energy Information Administration (U.S. DOE) 
EMS          Energy management system 
ERCOT        Electric Reliability Council of Texas 
ERO          Electric reliability organization 
FE           FirstEnergy 
FERC         Federal Energy Regulatory Commission (U.S.) 
FRCC         Florida Reliability Coordinating Council 
GW, GWh      Gigawatt, Gigawatt-hour 
IEEE         Institute of Electrical and Electronics Engineers 
IPP          Independent power producer 
ISAC         Information Sharing and Analysis Center 
kV, kVAr     Kilovolt, Kilovolt-Amperes-reactive 
kW, kWh      Kilowatt, Kilowatt-hour 
MAAC         Mid-Atlantic Area Council 
MAIN         Mid-America Interconnected Network 
MAPP         Mid-Continent Area Power Pool 
MECS         Michigan Electrical Coordinated Systems 
MVA, MVAr    Megavolt-Amperes, Megavolt-Amperes-reactive 
MW, MWh      Megawatt, Megawatt-hour 
NERC         North American Electric Reliability Council 
NESC         National Electrical Safety Code 
NPCC         Northeast Power Coordinating Council 
NRC          Nuclear Regulatory Commission (U.S.) 
NRCan        Natural Resources Canada 
OASIS        Open Access Same Time Information Service 
OETD         Office of Electric Transmission and Distribution (U.S. DOE) 
PJM          PJM Interconnection 
PUC          Public utility (or public service) commission (state) 
RC           Reliability coordinator 
ROW          Right-of-Way (transmission or distribution line, pipeline, etc.) 
RRC          Regional reliability council 
RTO          Regional Transmission Organization 
SCADA        Supervisory control and data acquisition 
SERC         Southeastern Electric Reliability Council 
SPP          Southwest Power Pool 
TVA          Tennessee Valley Authority (U.S.) 
WECC         Western Electricity Coordinating Council 


Appendix F 

Electricity Glossary 


AC: Alternating current; current that changes 
periodically (sinusoidally) with time. 

ACE: Area Control Error in MW. A negative value 
indicates a condition of under-generation relative 
to system load and imports, and a positive value 
denotes over-generation. 
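
The sign convention can be illustrated with the conventional ACE calculation, which this glossary does not itself state: the interchange error minus ten times the frequency bias times the frequency error. The sketch below omits the meter-error correction term; the function name and all numeric values are illustrative assumptions.

```python
def area_control_error(ni_actual_mw, ni_scheduled_mw,
                       f_actual_hz, f_scheduled_hz,
                       bias_mw_per_tenth_hz):
    """Return ACE in MW using the conventional formula
    ACE = (NI_actual - NI_scheduled) - 10 * B * (F_actual - F_scheduled).

    The frequency bias B is expressed in MW per 0.1 Hz and is negative
    by convention. The meter-error term is omitted in this sketch.
    """
    interchange_error = ni_actual_mw - ni_scheduled_mw
    frequency_term = 10.0 * bias_mw_per_tenth_hz * (f_actual_hz - f_scheduled_hz)
    return interchange_error - frequency_term


# Illustrative values: exporting 20 MW less than scheduled while
# frequency is 0.02 Hz low, with a bias of -50 MW per 0.1 Hz.
ace = area_control_error(500.0, 520.0, 59.98, 60.0, -50.0)
print(round(ace, 3))  # about -30 MW, i.e. under-generation
```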

Active Power: See “Real Power.” 

Adequacy: The ability of the electric system to 
supply the aggregate electrical demand and energy 
requirements of customers at all times, taking into 
account scheduled and reasonably expected 
unscheduled outages of system elements. 

AGC: Automatic Generation Control is a computation 
based on measured frequency and computed 
economic dispatch. Generation equipment under 
AGC automatically responds to signals from an 
EMS computer in real time to adjust power output 
in response to a change in system frequency, 
tie-line loading, or to a prescribed relation 
between these quantities. Generator output is 
adjusted so as to maintain a target system 
frequency (usually 60 Hz) and any scheduled MW 
interchange with other areas. 

Apparent Power: The product of voltage and 
current phasors. It comprises both active and reactive 
power, usually expressed in kilovoltamperes 
(kVA) or megavoltamperes (MVA). 
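
A small numeric sketch may help relate apparent, active, and reactive power. It uses the usual convention that complex power is the voltage phasor times the conjugate of the current phasor; the function name and the phasor values are illustrative assumptions.

```python
import cmath
import math


def complex_power(voltage, current):
    """Return complex power S = V * conj(I).

    The real part is active power P, the imaginary part is reactive
    power Q, and the magnitude |S| is apparent power.
    """
    return voltage * current.conjugate()


# Illustrative phasors: 100 V at 0 degrees, 10 A lagging by atan(3/4).
v = cmath.rect(100.0, 0.0)
i = cmath.rect(10.0, -math.atan2(3.0, 4.0))
s = complex_power(v, i)

apparent = abs(s)                 # volt-amperes
power_factor = s.real / apparent  # ratio of active to apparent power
print(round(s.real, 6), round(s.imag, 6), round(apparent, 6), round(power_factor, 6))
```

For these values the active power is 800 W, the reactive power is 600 VAr, and the apparent power is 1,000 VA, so the power factor is 0.8.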

Blackstart Capability: The ability of a generating 
unit or station to go from a shutdown condition to 
an operating condition and start delivering power 
without assistance from the bulk electric system. 

Bulk Electric System: A term commonly applied 
to the portion of an electric utility system that 
encompasses the electrical generation resources 
and bulk transmission system. 

Bulk Transmission: A functional or voltage 
classification relating to the higher voltage portion of 
the transmission system, specifically, lines at or 
above a voltage level of 115 kV. 

Bus: Shortened from the word busbar, meaning a 
node in an electrical network where one or more 
elements are connected together. 

Capacitor Bank: A capacitor is an electrical device 
that provides reactive power to the system and is 
often used to compensate for reactive load and 
help support system voltage. A bank is a collection 
of one or more capacitors at a single location. 

Capacity: The rated continuous load-carrying 
ability, expressed in megawatts (MW) or 
megavolt-amperes (MVA) of generation, 
transmission, or other electrical equipment. 

Cascading: The uncontrolled successive loss of 
system elements triggered by an incident. 
Cascading results in widespread service interruption, 
which cannot be restrained from sequentially 
spreading beyond an area predetermined by 
appropriate studies. 

Circuit: A conductor or a system of conductors 
through which electric current flows. 

Circuit Breaker: A switching device connected to 
the end of a transmission line capable of opening 
or closing the circuit in response to a command, 
usually from a relay. 

Control Area: An electric power system or 
combination of electric power systems to which a 
common automatic control scheme is applied in order 
to: (1) match, at all times, the power output of the 
generators within the electric power system(s) and 
capacity and energy purchased from entities 
outside the electric power system(s), with the load in 
the electric power system(s); (2) maintain, within 
the limits of Good Utility Practice, scheduled 
interchange with other Control Areas; (3) maintain 
the frequency of the electric power system(s) 
within reasonable limits in accordance with 
Good Utility Practice; and (4) provide sufficient 
generating capacity to maintain operating reserves 
in accordance with Good Utility Practice. 

Contingency: The unexpected failure or outage of 
a system component, such as a generator, 
transmission line, circuit breaker, switch, or other 
electrical element. A contingency also may include 
multiple components, which are related by 
situations leading to simultaneous component outages. 

Control Area Operator: An individual or 
organization responsible for controlling generation to 
maintain interchange schedule with other control 
areas and contributing to the frequency regulation 
of the interconnection. The control area is an 
electric system that is bounded by interconnection 
metering and telemetry. 

Current (Electric): The rate of flow of electrons in 
an electrical conductor measured in Amperes. 

Curtailability: The right of a transmission 
provider to interrupt all or part of a transmission 
service due to constraints that reduce the capability 
of the transmission network to provide that 
transmission service. Transmission service is to be 
curtailed only in cases where system reliability is 
threatened or emergency conditions exist. 

Demand: The rate at which electric energy is 
delivered to consumers or by a system or part of a 
system, generally expressed in kilowatts or 
megawatts, at a given instant or averaged over any 
designated interval of time. Also see “Load.” 

DC: Direct current; current that is steady and does 
not change sinusoidally with time (see “AC”). 

Dispatch Operator: Control of an integrated 
electric system involving operations such as 
assignment of levels of output to specific generating 
stations and other sources of supply; control of 
transmission lines, substations, and equipment; 
operation of principal interties and switching; and 
scheduling of energy transactions. 

Distribution: For electricity, the function of 
distributing electric power using low voltage lines to 
retail customers. 

Distribution Network: The portion of an electric 
system that is dedicated to delivering electric 
energy to an end user, at or below 69 kV. The 
distribution network consists primarily of 
low-voltage lines and transformers that “transport” 
electricity from the bulk power system to retail 
customers. 

Disturbance: An unplanned event that produces 
an abnormal system condition. 

Electrical Energy: The generation or use of 
electric power by a device over a period of time, 
expressed in kilowatthours (kWh), 
megawatthours (MWh), or gigawatthours (GWh). 

Electric Utility: Person, agency, authority, or 
other legal entity or instrumentality that owns or 
operates facilities for the generation, 
transmission, distribution, or sale of electric energy 
primarily for use by the public, and is defined as a 
utility under the statutes and rules by which it is 
regulated. An electric utility can be 
investor-owned, cooperatively owned, or 
government-owned (by a federal agency, crown corporation, 
State, provincial government, municipal 
government, and public power district). 

Element: Any electric device with terminals that 
may be connected to other electric devices, such 
as a generator, transformer, circuit, circuit 
breaker, or bus section. 

Energy Emergency: A condition when a system or 
power pool does not have adequate energy 
resources (including water for hydro units) to 
supply its customers’ expected energy requirements. 

Emergency: Any abnormal system condition that 
requires automatic or immediate manual action to 
prevent or limit loss of transmission facilities or 
generation supply that could adversely affect the 
reliability of the electric system. 

Emergency Voltage Limits: The operating voltage 
range on the interconnected systems that is 
acceptable for the time, sufficient for system 
adjustments to be made following a facility outage 
or system disturbance. 

EMS: An energy management system is a 
computer control system used by electric utility 
dispatchers to monitor the real time performance of 
various elements of an electric system and to 
control generation and transmission facilities. 

Fault: A fault usually means a short circuit, but 
more generally it refers to some abnormal system 
condition. Faults are often random events. 

Federal Energy Regulatory Commission (FERC): 
Independent Federal agency that, among other 
responsibilities, regulates the transmission and 
wholesale sales of electricity in interstate 
commerce. 

Flashover: A plasma arc initiated by some event 
such as lightning. Its effect is a short circuit on the 
network. 

Flowgate: A single or group of transmission 
elements intended to model MW flow impact relating 
to transmission limitations and transmission 
service usage. 

Forced Outage: The removal from service 
availability of a generating unit, transmission line, or 
other facility for emergency reasons or a condition 
in which the equipment is unavailable due to 
unanticipated failure. 

Frequency: The number of complete alternations 
or cycles per second of an alternating current, 
measured in Hertz. The standard frequency in the 
United States is 60 Hz. In some other countries the 
standard is 50 Hz. 

Frequency Deviation or Error: A departure from 
scheduled frequency; the difference between 
actual system frequency and the scheduled 
system frequency. 

Frequency Regulation: The ability of a Control 
Area to assist the interconnected system in 
maintaining scheduled frequency. This assistance can 
include both turbine governor response and 
automatic generation control. 

Frequency Swings: Constant changes in 
frequency from its nominal or steady-state value. 

Generation (Electricity): The process of 
producing electrical energy from other forms of energy; 
also, the amount of electric energy produced, 
usually expressed in kilowatt hours (kWh) or 
megawatt hours (MWh). 

Generator: Generally, an electromechanical 
device used to convert mechanical power to 
electrical power. 

Grid: An electrical transmission and/or 
distribution network. 

Grid Protection Scheme: Protection equipment 
for an electric power system, consisting of circuit 
breakers, certain equipment for measuring 
electrical quantities (e.g., current and voltage sensors) 
and devices called relays. Each relay is designed to 
protect the piece of equipment it has been 
assigned from damage. The basic philosophy in 
protection system design is that any equipment 
that is threatened with damage by a sustained 
fault is to be automatically taken out of service. 

Ground: A conducting connection between an 
electrical circuit or device and the earth. A ground 
may be intentional, as in the case of a safety 
ground, or accidental, which may result in high 
overcurrents. 

Imbalance: A condition where the generation and 
interchange schedules do not match demand. 

Impedance: The total effects of a circuit that 
oppose the flow of an alternating current 
consisting of inductance, capacitance, and resistance. It 
can be quantified in the units of ohms. 

Independent System Operator (ISO): An 
organization responsible for the reliable operation of the 
power grid under its purview and for providing 
open transmission access to all market 
participants on a nondiscriminatory basis. An ISO is 
usually not-for-profit and can advise utilities 
within its territory on transmission expansion and 
maintenance but does not have the responsibility 
to carry out the functions. 

Interchange: Electric power or energy that flows 
across tie-lines from one entity to another, 
whether scheduled or inadvertent. 

Interconnected System: A system consisting of 
two or more individual electric systems that 
normally operate in synchronism and have 
connecting tie lines. 

Interconnection: When capitalized, any one of the 
five major electric system networks in North 
America: Eastern, Western, ERCOT (Texas), 
Quebec, and Alaska. When not capitalized, the 
facilities that connect two systems or Control Areas. 
Additionally, an interconnection refers to the 
facilities that connect a nonutility generator to a 
Control Area or system. 

Interface: The specific set of transmission 
elements between two areas or between two areas 
comprising one or more electrical systems. 

ISAC: Information Sharing and Analysis Centers 
(ISACs) are designed by the private sector and 
serve as a mechanism for gathering, analyzing, 
appropriately sanitizing and disseminating 
private sector information. These centers could also 
gather, analyze, and disseminate information from 
Government for further distribution to the private 
sector. ISACs also are expected to share important 
information about vulnerabilities, threats, 
intrusions, and anomalies, but do not interfere with 
direct information exchanges between companies 
and the Government. 

Island: A portion of a power system or several 
power systems that is electrically separated from 
the interconnection due to the disconnection of 
transmission system elements. 

Kilovar (kVAr): Unit of alternating current 
reactive power equal to 1,000 VArs. 

Kilovolt (kV): Unit of electrical potential equal to 
1,000 Volts. 

Kilovolt-Amperes (kVA): Unit of apparent power 
equal to 1,000 volt amperes. Here, apparent power 
is in contrast to real power. On AC systems the 
voltage and current will not be in phase if reactive 
power is being transmitted. 

Kilowatthour (kWh): Unit of energy equaling one 
thousand watthours, or one kilowatt used over 
one hour. This is the normal quantity used for 
metering and billing electricity customers. The 
retail price for a kWh varies from approximately 4 
cents to 15 cents. At a 100% conversion efficiency, 
one kWh is equivalent to about 4 fluid ounces of 
gasoline, 3/16 pound of liquid petroleum, 3 cubic 
feet of natural gas, or 1/4 pound of coal. 

Line Trip: Refers to the automatic opening of the 
conducting path provided by a transmission line 
by the circuit breakers. These openings or “trips” 
are to protect the transmission line during faulted 
conditions. 

Load (Electric): The amount of electric power 
delivered or required at any specific point or 
points on a system. The requirement originates at 
the energy-consuming equipment of the 
consumers. See “Demand.” 

Load Shedding: The process of deliberately 
removing (either manually or automatically) 
preselected customer demand from a power system in 
response to an abnormal condition, to maintain 
the integrity of the system and minimize overall 
customer outages. 

Lockout: A state of a transmission line following 
breaker operations where the condition detected 
by the protective relaying was not eliminated by 
temporarily opening and reclosing the line, 
possibly several times. In this state, the circuit breakers 
cannot generally be reclosed without resetting a 
lockout device. 

Market Participant: An entity participating in the 
energy marketplace by buying/selling 
transmission rights, energy, or ancillary services into, out 
of, or through an ISO-controlled grid. 

Megawatthour (MWh): One million watthours. 

Metered Value: A measured electrical quantity 
that may be observed through telemetering, 
supervisory control and data acquisition (SCADA), or 
other means. 

Metering: The methods of applying devices that 
measure and register the amount and direction of 
electrical quantities with respect to time. 

NERC Interregional Security Network (ISN): A 
communications network used to exchange 
electric system operating parameters in near real time 
among those responsible for reliable operations of 
the electric system. The ISN provides timely and 
accurate data and information exchange among 
reliability coordinators and other system 
operators. The ISN, which operates over the frame relay 
NERCnet system, is a private Intranet that is 
capable of handling additional applications 
between participants. 

Normal (Precontingency) Operating Procedures: 
Operating procedures that are normally invoked 
by the system operator to alleviate potential 
facility overloads or other potential system problems 
in anticipation of a contingency. 

Normal Voltage Limits: The operating voltage 
range on the interconnected systems that is 
acceptable on a sustained basis. 

North American Electric Reliability Council 
(NERC): A not-for-profit company formed by the 
electric utility industry in 1968 to promote the 
reliability of the electricity supply in North 
America. NERC consists of nine Regional Reliability 
Councils and one Affiliate, whose members 
account for virtually all the electricity supplied in 
the United States, Canada, and a portion of Baja 
California Norte, Mexico. The members of these 
Councils are from all segments of the electricity 
supply industry: investor-owned, federal, rural 
electric cooperative, state/municipal, and 
provincial utilities, independent power producers, and 
power marketers. The NERC Regions are: East 
Central Area Reliability Coordination Agreement 
(ECAR); Electric Reliability Council of Texas 
(ERCOT); Mid-Atlantic Area Council (MAAC); 
Mid-America Interconnected Network (MAIN); 
Mid-Continent Area Power Pool (MAPP); 
Northeast Power Coordinating Council (NPCC); 
Southeastern Electric Reliability Council (SERC); 
Southwest Power Pool (SPP); Western Systems 
Coordinating Council (WSCC); and Alaskan 
Systems Coordination Council (ASCC, Affiliate). 

OASIS: Open Access Same Time Information 
Service (OASIS), developed by the Electric Power 
Research Institute, is designed to facilitate open 
access by providing users with access to 
information on transmission services and availability, 
plus facilities for transactions. 

Operating Criteria: The fundamental principles 
of reliable interconnected systems operation, 
adopted by NERC. 

Operating Guides: Operating practices that a 
Control Area or systems functioning as part of a 
Control Area may wish to consider. The application of 
Guides is optional and may vary among Control 
Areas to accommodate local conditions and 
individual system requirements. 

Operating Policies: The doctrine developed for 
interconnected systems operation. This doctrine 
consists of Criteria, Standards, Requirements, 
Guides, and instructions, which apply to all 
Control Areas. 

Operating Procedures: A set of policies, practices, 
or system adjustments that may be automatically 
or manually implemented by the system operator 
within a specified time frame to maintain the 
operational integrity of the interconnected electric 
systems. 

Operating Requirements: Obligations of a Control 
Area and systems functioning as part of a Control 
Area. 

Operating Security Limit: The value of a system 
operating parameter (e.g. total power transfer 
across an interface) that satisfies the most limiting 
of prescribed pre- and post-contingency operating 
criteria as determined by equipment loading 
capability and acceptable stability and voltage 
conditions. It is the operating limit to be observed so 
that the transmission system will remain reliable 
even if the worst contingency occurs. 

Operating Standards: The obligations of a Control 
Area and systems functioning as part of a Control 
Area that are measurable. An Operating Standard 
may specify monitoring and surveys for 
compliance. 

Outage: The period during which a generating 
unit, transmission line, or other facility is out of 
service. 

Planning Guides: Good planning practices and 
considerations that Regions, subregions, power 
pools, or individual systems should follow. The 
application of Planning Guides may vary to match 
local conditions and individual system 
requirements. 

Planning Policies: The framework for the 
reliability of interconnected bulk electric supply in terms 
of responsibilities for the development of and 
conformance to NERC Planning Principles and 
Guides and Regional planning criteria or guides, 
and NERC and Regional issues resolution 
processes. NERC Planning Procedures, Principles, 
and Guides emanate from the Planning Policies. 

Planning Principles: The fundamental 
characteristics of reliable interconnected bulk electric 
systems and the tenets for planning them. 

Planning Procedures: An explanation of how 
the Planning Policies are addressed and 
implemented by the NERC Engineering Committee, its 
subgroups, and the Regional Councils to achieve 
bulk electric system reliability. 

Post-contingency Operating Procedures: 
Operating procedures that may be invoked by the 
system operator to mitigate or alleviate system 
problems after a contingency has occurred. 

Protective Relay: A device designed to detect 
abnormal system conditions, such as electrical 
shorts on the electric system or within generating 
plants, and initiate the operation of circuit 
breakers or other control equipment. 

Power/Phase Angle: The angular relationship 
between an AC (sinusoidal) voltage across a 
circuit element and the AC (sinusoidal) current 
through it. The real power that can flow is related 
to this angle. 

Power: See “Real Power.” 

Power Flow: See “Current.” 

Rate: The authorized charges per unit or level of 
consumption for a specified time period for any of 
the classes of utility services provided to a 
customer. 

Rating: The operational limits of an electric 
system, facility, or element under a set of specified 
conditions. 

Reactive Power: The portion of electricity that 
establishes and sustains the electric and magnetic 
fields of alternating-current equipment. Reactive 
power must be supplied to most types of magnetic 
equipment, such as motors and transformers. It 
also must supply the reactive losses on transmis¬ 
sion facilities. Reactive power is provided by gen¬ 
erators, synchronous condensers, or electrostatic 
equipment such as capacitors and directly influ¬ 
ences electric system voltage. It is usually 
expressed in kilovars (kVAr) or megavars (MVAr), 
and is the mathematical product of voltage and 
current consumed by reactive loads. Examples of 
reactive loads include capacitors and inductors. 
These types of loads, when connected to an ac 
voltage source, will draw current, but because the 
current is 90 degrees out of phase with the applied 
voltage, they actually consume no real power. 
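
The relationship between phase angle, real power, and reactive power described above can be illustrated with a short calculation. The following is a minimal sketch for a single-phase load with sinusoidal voltage and current; the function name, parameter names, and the 120 V / 10 A figures are assumptions chosen for illustration and do not come from the report.

```python
import math


def real_and_reactive_power(v_rms, i_rms, phase_deg):
    """Return (P, Q) for a single-phase sinusoidal load.

    v_rms     -- RMS voltage in volts
    i_rms     -- RMS current in amperes
    phase_deg -- angle between voltage and current in degrees
                 (hypothetical parameter name used for this sketch)
    P is in watts, Q is in volt-amperes reactive (VAr).
    """
    phi = math.radians(phase_deg)
    p_watts = v_rms * i_rms * math.cos(phi)  # real (active) power
    q_vars = v_rms * i_rms * math.sin(phi)   # reactive power
    return p_watts, q_vars


# A purely reactive load: current 90 degrees out of phase with voltage.
p_reactive, q_reactive = real_and_reactive_power(120.0, 10.0, 90.0)

# A purely resistive load: current in phase with voltage.
p_resistive, q_resistive = real_and_reactive_power(120.0, 10.0, 0.0)
```

With the current 90 degrees out of phase, the real power is zero apart from floating-point rounding while the reactive power equals the full voltage-current product, which matches the statement that such loads draw current but consume no real power.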

Readiness: The extent to which an organizational
entity is prepared to meet the functional requirements
set by NERC or its regional council for entities
of that type or class.

Real Power: Also known as “active power.” The
rate at which work is performed or that energy is
transferred, usually expressed in kilowatts (kW) or
megawatts (MW). The terms “active power” or
“real power” are often used in place of the term
power alone to differentiate it from reactive
power.

Real-Time Operations: The instantaneous operations
of a power system as opposed to those operations
that are simulated.

Regional Reliability Council: One of ten Electric 
Reliability Councils that form the North American 
Electric Reliability Council (NERC). 

Regional Transmission Operator (RTO): An organization
that is independent from all generation
and power marketing interests and has exclusive
responsibility for electric transmission grid operations,
short-term electric reliability, and transmission
services within a multi-State region. To
achieve those objectives, the RTO manages transmission
facilities owned by different companies
and encompassing one, large, contiguous geographic
area.

Regulations: Rules issued by regulatory authorities
to implement laws passed by legislative
bodies.

Relay: A device that controls the opening and subsequent
reclosing of circuit breakers. Relays take
measurements from local current and voltage
transformers, and from communication channels
connected to the remote end of the lines. A relay
output trip signal is sent to circuit breakers when
needed.

Relay Setting: The parameters that determine
when a protective relay will initiate operation of
circuit breakers or other control equipment.

Reliability: The degree of performance of the elements
of the bulk electric system that results in
electricity being delivered to customers within
accepted standards and in the amount desired.
Reliability may be measured by the frequency,
duration, and magnitude of adverse effects on the
electric supply. Electric system reliability can be
addressed by considering two basic and functional
aspects of the electric system, Adequacy
and Security.

Reliability Coordinator: An individual or organization
responsible for the safe and reliable operation
of the interconnected transmission system for
their defined area, in accordance with NERC reliability
standards, regional criteria, and subregional
criteria and practices. This entity facilitates the
sharing of data and information about the status
of the Control Areas for which it is responsible,
establishes a security policy for these Control
Areas and their interconnections, and coordinates
emergency operating procedures that rely on common
operating terminology, criteria, and
standards.

Resistance: The characteristic of materials to
restrict the flow of current in an electric circuit.
Resistance is inherent in any electric wire, including
those used for the transmission of electric
power. Resistance in the wire is responsible for
heating the wire as current flows through it and
the subsequent power loss due to that heating.

Restoration: The process of returning generators
and transmission system elements to service and restoring
load following an outage on the electric system.

Right-of-Way (ROW) Maintenance: Activities by 
utilities to maintain electrical clearances along 
transmission or distribution lines. 

Safe Limits: System limits on quantities such as 
voltage or power flows such that if the system is 
operated within these limits it is secure and 
reliable. 

SCADA: Supervisory Control and Data Acquisition
system; a system of remote control and telemetry
used to monitor and control the electric
system.

Schedule: An agreed-upon transaction size (megawatts),
start and end time, beginning and ending
ramp times and rate, and type required for delivery
and receipt of power and energy between the
contracting parties and the Control Area(s)
involved in the transaction.

Scheduling Coordinator: An entity certified by an
ISO or RTO for the purpose of undertaking scheduling
functions.

Seams: The boundaries between adjacent electricity-related
organizations. Differences in regulatory
requirements or operating practices may create
“seams problems.”

Security: The ability of the electric system to withstand
sudden disturbances such as electric short
circuits or unanticipated loss of system elements.

Security Coordinator: An individual or organization
that provides the security assessment and
emergency operations coordination for a group of
Control Areas.

Short Circuit: A low resistance connection unintentionally
made between points of an electrical
circuit, which may result in current flow far above
normal levels.

Shunt Capacitor Bank: Shunt capacitors are
capacitors connected from the power system to an 
electrical ground. They are used to supply kilovars 
(reactive power) to the system at the point where 
they are connected. A shunt capacitor bank is a 
group of shunt capacitors. 

Single Contingency: The sudden, unexpected failure
or outage of a system facility(s) or element(s)
(generating unit, transmission line, transformer,
etc.). Elements removed from service as part of the
operation of a remedial action scheme are considered
part of a single contingency.

Special Protection System: An automatic protection
system designed to detect abnormal or predetermined
system conditions, and take corrective
actions other than and/or in addition to the isolation
of faulted components.

Stability: The ability of an electric system to maintain
a state of equilibrium during normal and
abnormal system conditions or disturbances.

Stability Limit: The maximum power flow possible
through a particular point in the system while
maintaining stability in the entire system or the
part of the system to which the stability limit
refers.

State Estimator: Computer software that takes
redundant measurements of quantities related to
system state as input and provides an estimate of
the system state (bus voltage phasors). It is used to
confirm that the monitored electric power system
is operating in a secure state by simulating the system
both at the present time and one step ahead,
for a particular network topology and loading condition.
With the use of a state estimator and its
associated contingency analysis software, system
operators can review each critical contingency to
determine whether each possible future state is
within reliability limits.
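
The idea of combining redundant measurements into one estimate can be shown with a deliberately tiny example. The sketch below assumes a single unknown quantity (one bus voltage magnitude in per unit) measured several times with different accuracies, and combines the readings by weighted least squares. A production state estimator solves a much larger nonlinear problem over all bus voltage phasors; the function name, the measurement values, and the standard deviations here are hypothetical.

```python
def weighted_least_squares_estimate(measurements):
    """Estimate one quantity from redundant measurements.

    measurements -- list of (value, sigma) pairs, where sigma is the
                    assumed standard deviation of that measurement.
    Each measurement is weighted by 1 / sigma**2, so more accurate
    readings pull the estimate harder than less accurate ones.
    """
    if not measurements:
        raise ValueError("at least one measurement is required")
    weights = [1.0 / (sigma ** 2) for _, sigma in measurements]
    weighted_sum = sum(w * value for w, (value, _) in zip(weights, measurements))
    return weighted_sum / sum(weights)


# Three hypothetical per-unit voltage readings for the same bus.
readings = [(1.02, 0.01), (1.00, 0.02), (1.03, 0.01)]
estimate = weighted_least_squares_estimate(readings)
```

The estimate lands between the readings and closer to the two more accurate ones, which is the behavior that lets an estimator tolerate a noisy or suspect measurement when redundancy is available.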

Station: A node in an electrical network where 
one or more elements are connected. Examples 
include generating stations and substations. 

Storage: Energy transferred from one entity to
another entity that has the ability to conserve the
energy (e.g., stored as water in a reservoir or coal in a
pile) with the intent that the energy will be
returned at a time when such energy is more useable
to the original supplying entity.

Substation: Facility equipment that switches,
changes, or regulates electric voltage.

Subtransmission: A functional or voltage classification
relating to lines at voltage levels between
69 kV and 115 kV.

Supervisory Control and Data Acquisition 
(SCADA): See SCADA. 

Surge: A transient variation of current, voltage, or
power flow in an electric circuit or across an electric
system.

Surge Impedance Loading: The maximum 
amount of real power that can flow down a 
lossless transmission line such that the line does 
not require any VArs to support the flow. 
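
As a worked illustration, surge impedance loading is commonly computed as the square of the line-to-line voltage divided by the line's surge (characteristic) impedance. The sketch below applies that standard textbook formula; it is not taken from the report, and the 345 kV voltage and 300 ohm surge impedance are assumed round numbers rather than data for any particular line.

```python
def surge_impedance_loading_mw(line_voltage_kv, surge_impedance_ohm):
    """Return surge impedance loading in megawatts.

    line_voltage_kv     -- line-to-line voltage in kilovolts
    surge_impedance_ohm -- surge (characteristic) impedance in ohms
    Because kV squared divided by ohms yields MW, no unit
    conversion factor is needed.
    """
    if surge_impedance_ohm <= 0:
        raise ValueError("surge impedance must be positive")
    return line_voltage_kv ** 2 / surge_impedance_ohm


# Hypothetical 345 kV line with an assumed 300 ohm surge impedance.
sil_mw = surge_impedance_loading_mw(345.0, 300.0)  # 396.75 MW
```

Below this loading a line tends to produce net VArs, and above it the line tends to absorb them, which is why the figure is a useful reference point when discussing reactive support.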

Switching Station: Facility equipment used to tie 
together two or more electric circuits through 
switches. The switches are selectively arranged to 
permit a circuit to be disconnected, or to change 
the electric connection between the circuits. 

Synchronize: The process of connecting two previously
separated alternating current apparatuses
after matching frequency, voltage, phase angles,
etc. (e.g., paralleling a generator to the electric
system).

System: An interconnected combination of generation,
transmission, and distribution components
comprising an electric utility and independent
power producer(s) (IPP), or group of utilities and
IPP(s).

System Operator: An individual at an electric system
control center whose responsibility it is to
monitor and control that electric system in real
time.

System Reliability: A measure of an electric system’s
ability to deliver uninterrupted service at
the proper voltage and frequency.

Thermal Limit: A power flow limit based on the 
possibility of damage by heat. Heating is caused by 
the electrical losses which are proportional to the 
square of the real power flow. More precisely, a 
thermal limit restricts the sum of the squares of 
real and reactive power. 
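
The "sum of the squares" statement corresponds to checking apparent power against an equipment rating. The following sketch assumes the thermal rating is expressed in MVA and that voltage is near nominal, so that current (and therefore heating) tracks apparent power; the function names and the numeric values are illustrative assumptions.

```python
import math


def apparent_power_mva(p_mw, q_mvar):
    """Return apparent power, the square root of P squared plus Q squared."""
    return math.hypot(p_mw, q_mvar)


def within_thermal_limit(p_mw, q_mvar, rating_mva):
    """Return True if the combined real and reactive flow is within the rating."""
    return apparent_power_mva(p_mw, q_mvar) <= rating_mva


# Hypothetical line with an assumed 450 MVA thermal rating.
ok_real_only = within_thermal_limit(300.0, 0.0, 450.0)        # 300 MVA loading
ok_with_reactive = within_thermal_limit(300.0, 400.0, 450.0)  # 500 MVA loading
```

The second case carries the same real power as the first, yet exceeds the rating once reactive flow is included, which is why reactive power flows reduce the real power a line can carry within its thermal limit.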

Tie-line: The physical connection (e.g., transmission
lines, transformers, switch gear) between
two electric systems that permits the transfer of
electric energy in one or both directions.

Time Error: An accumulated time difference 
between Control Area system time and the time 
standard. Time error is caused by a deviation in 
Interconnection frequency from 60.0 Hertz. 
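
To make the accumulation concrete, the sketch below sums the fractional frequency deviation over time for a clock driven by system frequency. It assumes evenly spaced frequency samples; the function name, the sampling interval, and the 59.98 Hz example are hypothetical.

```python
def time_error_seconds(frequency_samples_hz, sample_interval_s, nominal_hz=60.0):
    """Return accumulated time error in seconds.

    A clock synchronized to system frequency gains or loses
    (f - nominal) / nominal seconds for every second of real time.
    Negative results mean the clock has fallen behind the time standard.
    """
    return sum(
        (f - nominal_hz) / nominal_hz * sample_interval_s
        for f in frequency_samples_hz
    )


# One hour of one-second samples with frequency held at 59.98 Hz.
error_s = time_error_seconds([59.98] * 3600, 1.0)  # about -1.2 seconds
```

A sustained deviation of only 0.02 Hz therefore costs roughly 1.2 seconds per hour, which is the kind of drift that a time error correction is scheduled to remove.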

Time Error Correction: An offset to the Interconnection’s
scheduled frequency to correct for the
time error accumulated on electric clocks.

Transactions: Sales of bulk power via the transmission
grid.

Transfer Limit: The maximum amount of power 
that can be transferred in a reliable manner from 
one area to another over all transmission lines (or 
paths) between those areas under specified system 
conditions. 

Transformer: A device that operates on magnetic 
principles to increase (step up) or decrease (step 
down) voltage. 

Transient Stability: The ability of an electric system
to maintain synchronism between its parts
when subjected to a disturbance and to regain a
state of equilibrium following that disturbance.

Transmission: An interconnected group of lines
and associated equipment for the movement or
transfer of electric energy between points of supply
and points at which it is transformed for delivery
to customers or is delivered to other electric
systems.

Transmission Loading Relief (TLR): A procedure
used to manage congestion on the electric transmission
system.

Transmission Margin: The difference between 
the maximum power flow a transmission line can 
handle and the amount that is currently flowing 
on the line. 

Transmission Operator: NERC-certified party
responsible for monitoring and assessing local
reliability conditions, who operates the transmission
facilities, and who executes switching orders
in support of the Reliability Authority.

Transmission Overload: A state where a transmission
line has exceeded either a normal or emergency
rating of the electric conductor.

Transmission Owner (TO) or Transmission Provider:
Any utility that owns, operates, or controls
facilities used for the transmission of electric
energy.

Trip: The opening of a circuit breaker or breakers
on an electric system, normally to electrically isolate
a particular element of the system to prevent it
from being damaged by fault current or other
potentially damaging conditions. See “Line Trip”
for example.

Voltage: The electrical force, or “pressure,” that 
causes current to flow in a circuit, measured in 
Volts. 

Voltage Collapse (decay): An event that occurs 
when an electric system does not have adequate 
reactive support to maintain voltage stability. 
Voltage Collapse may result in outage of system 
elements and may include interruption in service 
to customers. 

Voltage Control: The control of transmission voltage
through adjustments in generator reactive output
and transformer taps, and by switching
capacitors and inductors on the transmission and
distribution systems.

Voltage Limits: Hard limits above or below which
the operating condition is undesirable. Normal
limits are between 95 and 105 percent of the nominal
voltage at the bus under discussion.
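
The 95 to 105 percent band can be expressed as a simple per-unit check. This is a minimal sketch; the function name, the default limits as keyword arguments, and the 345 kV example bus are assumptions for illustration.

```python
def voltage_within_normal_limits(measured_kv, nominal_kv, low=0.95, high=1.05):
    """Return True if the bus voltage is within the normal band.

    The measured voltage is converted to per unit of nominal and
    compared against the lower and upper limits (inclusive).
    """
    if nominal_kv <= 0:
        raise ValueError("nominal voltage must be positive")
    per_unit = measured_kv / nominal_kv
    return low <= per_unit <= high


# Hypothetical readings on a bus with a 345 kV nominal voltage.
normal = voltage_within_normal_limits(330.0, 345.0)     # about 0.957 per unit
too_low = voltage_within_normal_limits(320.0, 345.0)    # about 0.928 per unit
too_high = voltage_within_normal_limits(365.0, 345.0)   # about 1.058 per unit
```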

Voltage Reduction: A procedure designed to 
deliberately lower the voltage at a bus. It is often 
used as a means to reduce demand by lowering the 
customer’s voltage. 

Voltage Stability: The condition of an electric system
in which the sustained voltage level is controllable
and within predetermined limits.

Watthour (Wh): A unit of measure of electrical 
energy equal to 1 watt of power supplied to, or 
taken from, an electric circuit steadily for 1 hour. 

Appendix G


Transmittal Letters from the Three Working Groups 

Mr. James W. Glotfelty 
Director, Office of Electric Transmission 
and Distribution 
U.S. Department of Energy 
1000 Independence Avenue SW 
Washington, DC 20585 

Dr. Nawal Kamel 

Special Assistant to the Deputy Minister 

Natural Resources Canada 

580 Booth Street 

Ottawa, ON 

K1A 0E4

Dear Mr. Glotfelty and Dr. Kamel: 

Enclosed is the Final Report of the Electric System Working Group (ESWG) supporting the 
United States - Canada Power System Outage Task Force. 

This report presents the results of an intensive and thorough investigation by a bi-national team 
into the causes of the blackout that occurred on August 14, 2003, and recommendations to 
prevent and reduce the scope of future blackouts. We believe that systematic implementation of 
these recommendations is critical to maintaining the reliability of bulk power supplies in North 
America. 

The report was written largely by the three co-chairs of the Electric System Working Group 
(David Meyer, Alison Silverstein, and Tom Rusnov), who also co-chaired the Task Force’s 
Electric System Investigation. They did so with the benefit of extensive input and assistance 
from many members of the investigation team. Other members of the ESWG reviewed the 
report in draft and provided valuable suggestions for its improvement. Those members join us in 
this submittal and have signed on the following page. 


Sincerely,

David H. Meyer
Senior Advisor
U.S. Department of Energy
Co-Chair, Electric System Working Group

Tom Rusnov
Senior Advisor
Natural Resources Canada
Co-Chair, Electric System Working Group

Alison Silverstein
Senior Energy Policy Advisor to the Chairman
Federal Energy Regulatory Commission
Co-Chair, Electric System Working Group

David Barrie
Senior Vice President
Asset Management
Hydro One Inc.

David Burpee, Director
Renewable and Electrical Energy Division
Natural Resources Canada

Donald Downes, Chairman
Connecticut Department of
Public Utility Control

Joseph Eto, Staff Scientist
Lawrence Berkeley National Laboratory
(U.S.), and Consortium for Electric
Reliability Solutions

Jeanne Fox, President
New Jersey Board of Public Utilities

Blaine Loper, Senior Engineer
Pennsylvania Public Utility Commission

Chair, National Energy and Infrastructure
Industry Group
Gowlings, Lafleur, Henderson LLP
Ontario

David O’Brien, Commissioner
Vermont Department of Public Service

David O’Connor, Commissioner
Div. of Energy Resources
Massachusetts Office of Consumer Affairs
and Business Regulation

Alan Schriber, Chairman
Ohio Public Utilities Commission

Senior Vice President, Transmission
New York Power Authority

J. Peter Lark, Chairman
Michigan Public Service Commission

Gene Whitney, Policy Analyst
National Science and Technology Council
U.S. Office of Science and Technology
Policy, Executive Office of the President

UNITED STATES
NUCLEAR REGULATORY COMMISSION
WASHINGTON, D.C. 20555-0001

Canadian Nuclear Safety Commission
Commission canadienne de sûreté nucléaire
President and Chief Executive Officer

February 27, 2004


Mr. James Glotfelty 
Director, Office of Electric 
Transmission and Distribution 
U.S. Department of Energy 
1000 Independence Ave., Suite 7B-222 
Washington, DC 20585 

Dr. Nawal Kamel 

Special Assistant to the Deputy Minister 

Natural Resources Canada 

580 Booth Street 

Ottawa, ON 

K1A 0E4 

Dear Mr. Glotfelty and Dr. Kamel: 

Enclosed for incorporation into the Task Force report are revisions to the Interim Report 
and possible recommendations submitted for consideration by the Nuclear Working Group 
supporting the United States - Canada Joint Power System Outage Task Force. The members 
of the Nuclear Working Group join us in this submittal and have signed on the attached pages. 

Please provide any comments related to the Canadian nuclear plants to either Mr. Pat 
Hawley (613-947-3992; hawleyp@cnsc-ccsn.gc.ca) or Mr. Mark Dallaire (613-947-0957; 
dallairem@cnsc-ccsn.gc.ca). Comments on the U.S. nuclear plants should be directed to either 
Mr. Cornelius Holden (301-415-3036; cfh@nrc.gov) or Mr. John Boska (301-415-2901; 
jpb1@nrc.gov). 


Sincerely, 




Chairman 

U.S. Nuclear Regulatory Commission 
U.S. Co-chair, Nuclear Working Group 


President and Chief Executive Officer 
Canadian Nuclear Safety Commission 
Canadian Co-chair, Nuclear Working Group 


Enclosures: Nuclear Working Group Signature Pages (2) 
Nuclear Working Group Final Report 

cc w/encls: Mr. Ian Grant

Director General, Reactor Power Regulation 
Canadian Nuclear Safety Commission 

Mr. Samuel J. Collins 

Deputy Executive Director, Reactor Programs 
U.S. Nuclear Regulatory Commission 

The members of the Nuclear Working Group hereby submit this report as input to the United
States - Canada Joint Power System Outage Task Force: 


Nils J. Diaz, Chairman

U.S. Nuclear Regulatory Commission 
Co-chair, Nuclear Working Group 


Samuel J. Collins, Deputy Executive Director 

for Reactor Programs 

U.S. Nuclear Regulatory Commission 


William D. Magwood, IV, Director, Office of 
Nuclear Energy, Science and Technology 
U.S. Department of Energy 


Edward Wilds, Bureau of Air Management, 
Department of Environmental Protection 
(Connecticut) 



J. Peter Lark, Chairman, Public Service
Commission (Michigan)



David O’Connor, Commissioner, Division of
Energy Resources, Office of Consumer
Affairs and Business Regulation
(Massachusetts)



Frederick F. Butler, Commissioner, New
Jersey Board of Public Utilities (New Jersey)


Paul Eddy, Power Systems Operations 
Specialist, Public Service Commission (New 
York) 



David J. Allard, CHP, Director, Bureau of 
Radiation Protection, Department of 
Environmental Protection (Pennsylvania) 


Dr. G. Ivan Maldonado, Associate Professor, 
Mechanical, Industrial and Nuclear 
Engineering; University of Cincinnati (Ohio) 



David O’Brien, Commissioner 
Department of Public Service (Vermont) 

The members of the Nuclear Working Group hereby submit this report as input to the United
States - Canada Joint Power System Outage Task Force: 



President and Chief Executive Officer 
Canadian Nuclear Safety Commission 
Co-chair, Nuclear Working Group 






James Blyth
Director-General, Directorate of Nuclear
Substance Regulation
Canadian Nuclear Safety Commission


Ken Pereira 

Vice-President, Operations Branch 
Canadian Nuclear Safety Commission 



Dr. Robert Morrison
Senior Advisor to the Deputy Minister
Natural Resources Canada



Chief Executive Officer 
Bruce Power 

(Representing the Province of Ontario) 

Mr. James W. Glotfelty
Director, Office of Electric Transmission 
and Distribution 
U.S. Department of Energy 
1000 Independence Avenue SW 
Washington, DC 20585 

Dr. Nawal Kamel 

Special Assistant to the Deputy Minister 

Natural Resources Canada 

580 Booth Street 

Ottawa, ON 

K1A 0E4 

Dear Mr. Glotfelty and Dr. Kamel: 

Enclosed is the Final Report of the Security Working Group (SWG) supporting the United States 
- Canada Power System Outage Task Force. 

The SWG Final Report presents the results of the Working Group's analysis of the security 
aspects of the power outage that occurred on August 14, 2003 and provides recommendations for 
Task Force consideration on security-related issues in the electricity sector. This report 
comprises input from public sector, private sector, and academic members of the SWG, with 
important assistance from many members of the Task Force’s investigative team. As co-chairs 
of the Security Working Group, we represent all members of the SWG in this submittal and have 
signed below. 


Sincerely,

Bob Liscouski
Assistant Secretary for Infrastructure Protection,
U.S. Department of Homeland Security
Co-Chair, SWG

William J.S. Elliott
Assistant Secretary to the Cabinet,
Security and Intelligence,
Privy Council Office
Government of Canada
Co-Chair, SWG

Attachment 1:


U.S.-Canada Power System Outage Task Force SWG Steering Committee members: 


Bob Liscouski, Assistant Secretary for 
Infrastructure Protection, Department of Homeland 
Security (U.S. Government) (Co-Chair) 

William J.S. Elliott, Assistant Secretary to the 
Cabinet, Security and Intelligence, Privy Council 
Office (Government of Canada) (Co-Chair) 

U.S. Members 

Andy Purdy, Deputy Director, National Cyber Security 
Division, Department of Homeland Security 

Hal Hendershot, Acting Section Chief, Computer 
Intrusion Section, FBI 

Steve Schmidt, Section Chief, Special Technologies 
and Applications, FBI 

Kevin Kolevar, Senior Policy Advisor to the Secretary, 
DoE 

Simon Szykman, Senior Policy Analyst, U.S. Office of
Science & Technology Policy, White House

Vincent DeRosa, Deputy Commissioner, Director of 
Homeland Security (Connecticut) 

Richard Swensen, Under-Secretary, Office of Public 
Safety and Homeland Security (Massachusetts) 

Colonel Michael C. McDaniel (Michigan) 


Sid Caspersen, Director, Office of Counter-Terrorism 
(New Jersey) 

James McMahon, Senior Advisor (New York) 

John Overly, Executive Director, Division of Homeland 
Security (Ohio) 

Arthur Stephens, Deputy Secretary for Information
Technology (Pennsylvania)

Kerry L. Sleeper, Commissioner, Public Safety 
(Vermont) 

Canada Members 

James Harlick, Assistant Deputy Minister, Office of 
Critical Infrastructure Protection and Emergency 
Preparedness 

Michael Devaney, Deputy Chief, Information
Technology Security, Communications Security
Establishment

Peter MacAulay, Officer, Technological Crime Branch 
of the Royal Canadian Mounted Police 

Gary Anderson, Chief, Counter-Intelligence - Global, 
Canadian Security Intelligence Service 

Dr. James Young, Commissioner of Public Security, 
Ontario Ministry of Public Safety and Security 